Abstract

Ever since entanglement was identified as a computational and cryptographic resource, 
effort has been made to find an efficient way to tell whether a given density matrix represents 
an unentangled, or separable, state. Essentially, this is the quantum separability problem. 

In Chapter 1, I begin with a brief introduction to quantum states, entanglement, and
a basic formal definition of the quantum separability problem. I conclude the first chapter 
with a summary of one-sided tests for separability, including those involving semidefinite 
programming. 

In Chapter 2, I apply polyhedral theory to prove easily that the set of separable states
is not a polytope; for the sake of completeness, I then review the role of polytopes in non- 
locality. Next, I give a novel treatment of entanglement witnesses and define a new class of 
entanglement witnesses, which may prove to be useful beyond the examples given. In the last 
section, I briefly review the five basic convex body problems given in 0], and their application 
to the quantum separability problem. 

In Chapter 3, I treat the separability problem as a computational decision problem and
motivate its approximate formulations. After a review of basic complexity-theoretic notions, 
I discuss the computational complexity of the separability problem: I discuss the issue of 
NP-completeness, giving an alternative definition of the separability problem as an NP-hard 
problem in NP. I finish the chapter with a comprehensive survey of deterministic algorithmic 
solutions to the separability problem, including one that follows from a second NP formulation. 

Chapters 1 to 3 motivate a new interior-point algorithm which, given the expected values
of a subset of an orthogonal basis of observables of an otherwise unknown quantum state, 
searches for an entanglement witness in the span of the subset of observables. When all the 
expected values are known, the algorithm solves the separability problem. In Chapter 4, I
give the motivation for the algorithm and show how it can be used in a particular physical 
scenario to detect entanglement (or decide separability) of an unknown quantum state using 
as few quantum resources as possible. I then explain the intuitive idea behind the algorithm 
and relate it to the standard algorithms of its kind. I end the chapter with a comparison of 
the complexities of the algorithms surveyed in Chapter 3. Finally, in Chapter 5, I present the
details of the algorithm and discuss its performance relative to standard methods. 
This work attempts to give a comprehensive treatment of the state of the art in deterministic
algorithms for the quantum separability problem in the finite-dimensional and bipartite
case. The need for such a treatment stems from the very recent (2003 and later) proposals 
for separability algorithms - all quite different from one another. It is likely that these recent 
papers emerged when they did because of the (disheartening) result of Gurvits (2001) showing 
the problem to be computationally intractable: given that the problem is hard, what is the 
best we can do to solve it? Among these proposals is my algorithm (done in collaboration), 
which will be shown to compare favorably to the others, complexity-theoretically. 

Gurvits' result, that the separability problem is NP-hard, raised a question among the 
quantum information community: "...but then isn't it NP-complete?" After hearing many 
people ask this question, I set out to clarify the issue and show that the separability problem 
is NP-complete in the usual sense (that is, with respect to Karp reductions). The latter 
part of this mission is as yet unsuccessful, but the partial results are presented, including 
a redefining of the separability problem as an NP-hard problem in NP (previous definitions 
could not place the problem in NP, rather only in a modified version of NP). 

Entanglement witnesses have been around since 1996, and had been extensively studied up 
until recently, especially by the Innsbruck-Hannover group, which produced interesting char- 
acterisations of entanglement witnesses and showed how to construct optimal entanglement 
witnesses. I approached entanglement witnesses from the viewpoint of polyhedral theory, 
rather than linear-operator theory. The result was the immediate solution of an open problem
of whether the separable states form a polytope. Under a slightly different definition of
"entanglement witness", I discover a new class of entanglement witnesses which I call
"ambidextrous entanglement witnesses". These correspond to observables whose expected values
can indicate that a state is entangled on opposite sides of the set of separable states.
Chapter 1
Introduction 



"Just because it's hard, it doesn't mean you don't try." When my mother said these words 
to me way back when I was a Master's student, I had no idea they would open my PhD thesis. 

Ever since quantum-mechanical phenomena were identified as computational and crypto- 
graphic resources, researchers have become even more interested in precisely characterising 
the features of quantum theory that set it apart from classical physical theory. Two of these 
features are nonlocality and entanglement, both of which are "provably hard" to characterise; 
that is, deciding whether a quantum state exhibits nonlocality or entanglement is as hard as 
some of the hardest and most important problems in complexity theory. 

This thesis concentrates on the latter problem of deciding whether a quantum state is 
unentangled, or separable. I review all of the deterministic algorithms proposed for the
separability problem, including two of my own, in an attempt to discover which has the best 
asymptotic complexity. Along the way, I look at entanglement witnesses in a new light and 
discuss the computational complexity of the separability problem. 

In Section 1.1 I review some elements of quantum mechanics and define and give the
significance of separable states. The remainder of the chapter discusses partial solutions to 
the separability problem. 

1.1 Quantum physics 

The pure state of a d-dimensional quantum physical system is represented mathematically
by a complex unit vector$^1$ $|\psi\rangle \in \mathbb{C}^d$, where the "global phase" of $|\psi\rangle$ is irrelevant; that
is, for any real $\phi$, $e^{i\phi}|\psi\rangle$ represents the same physical state as $|\psi\rangle$. If the system can be
physically partitioned into two subsystems (denoted by superscripts A and B) of dimensions
M and N, such that d = MN, then $|\psi\rangle$ may be separable, which means $|\psi\rangle = |\psi^A\rangle \otimes |\psi^B\rangle$,
for $|\psi^A\rangle \in \mathbb{C}^M$ and $|\psi^B\rangle \in \mathbb{C}^N$, and where "$\otimes$" denotes the Kronecker (tensor) product.
Without loss of generality, assume $M \le N$ unless otherwise stated. If $|\psi\rangle$ is not separable,
then it is entangled (with respect to that particular partition).

1 Some conventions do not require the normalisation constraint; i.e. sometimes it is useful to work
without it and refer to "unnormalised states".

More generally, the state of the system may be a mixed state, which is a statistical
distribution of pure states. A mixed state $\rho$ is usually represented as the density operator
$\rho = \sum_{i=1}^{k} p_i |\psi_i\rangle\langle\psi_i|$, where $|\psi_i\rangle \in \mathbb{C}^d$, $\sum_{i=1}^{k} p_i = 1$, $p_i \ge 0$, and $\langle\psi_i|$ is the dual vector of $|\psi_i\rangle$.
A mixed state is thus a positive semidefinite (and hence Hermitian, or self-adjoint) operator
with unit trace$^2$: $\rho \ge 0$ and $\operatorname{tr}(\rho) = 1$. Denote the set of all density operators mapping
complex vector space V to itself by $\mathcal{D}(V)$; let $\mathcal{D}_{M,N} := \mathcal{D}(\mathbb{C}^M \otimes \mathbb{C}^N)$. The maximally mixed
state is $I_{M,N} := I/MN$, where $I$ denotes the identity operator. A density operator $\rho$ satisfies
$0 \le \operatorname{tr}(\rho^2) \le 1$ and represents a pure state if and only if $\operatorname{tr}(\rho^2) = 1$. A pure state $|\psi\rangle$ is
separable if and only if $\operatorname{tr}_B(|\psi\rangle\langle\psi|)$ is a pure state, where "$\operatorname{tr}_B$" denotes the partial trace
with respect to subsystem B (e.g. see Exercise 2.78 in 3|); a pure state $|\psi\rangle$ is called maximally
entangled if $\operatorname{tr}_B(|\psi\rangle\langle\psi|)$ is the maximally mixed state $I/M$ in the space of density operators
on the A-subsystem $\mathcal{D}(\mathbb{C}^M)$. Thus, the mixedness of $\operatorname{tr}_B(|\psi\rangle\langle\psi|)$ is some "measure" of the
entanglement of $|\psi\rangle$ (see Section 1.3.3).
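The purity criterion for pure states can be checked numerically. The following is a minimal sketch, assuming NumPy and the standard (Kronecker) ordering of the basis of $\mathbb{C}^M \otimes \mathbb{C}^N$; the helper names `reduced_state_A` and `purity` are illustrative and not part of the original text.

```python
import numpy as np

def reduced_state_A(psi, M, N):
    """Return tr_B(|psi><psi|) for a pure state psi in C^M (x) C^N."""
    # c[a, b] is the amplitude of |a>|b> in the standard product basis.
    c = np.asarray(psi, dtype=complex).reshape(M, N)
    return c @ c.conj().T

def purity(rho):
    """tr(rho^2); equals 1 exactly when rho is pure."""
    return float(np.real(np.trace(rho @ rho)))

# A product state |0> (x) (|0> + |1>)/sqrt(2) and a maximally entangled state.
product_state = np.kron([1, 0], [1, 1]) / np.sqrt(2)
bell_state = np.array([1, 0, 0, 1]) / np.sqrt(2)

purity_product = purity(reduced_state_A(product_state, 2, 2))  # reduced state is pure
purity_bell = purity(reduced_state_A(bell_state, 2, 2))        # reduced state is I/2
```

For the product state the reduced state has purity 1, while for the maximally entangled state it is the maximally mixed state $I/2$ with purity 1/2.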

A mixed state $\rho \in \mathcal{D}_{M,N}$ is separable if and only if it may be written $\rho = \sum_{i=1}^{k} p_i\, \rho_i^A \otimes \rho_i^B$
with $p_i \ge 0$ and $\sum_i p_i = 1$, and where $\rho_i^A \in \mathcal{D}(\mathbb{C}^M)$ is a (mixed or pure) state of the
A-subsystem (and similarly for $\rho_i^B \in \mathcal{D}(\mathbb{C}^N)$); when $k = 1$, $\rho$ is a product state. Let $\mathcal{S}_{M,N} \subset \mathcal{D}_{M,N}$
denote the separable states; let $\mathcal{E}_{M,N} := \mathcal{D}_{M,N} \setminus \mathcal{S}_{M,N}$ denote the entangled states. The
following fact will be used several times throughout this thesis:

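Fact 1 ([3]). If $\sigma \in \mathcal{S}_{M,N}$, then $\sigma$ may be written as a convex combination of $M^2N^2$ pure
product states, that is,

$$\sigma = \sum_{i=1}^{M^2N^2} p_i\, |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|, \qquad (1.1)$$

where $\sum_{i=1}^{M^2N^2} p_i = 1$ and $0 \le p_i \le 1$ for all $i = 1, 2, \ldots, M^2N^2$.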

Recall that a set of points $\{x_1, \ldots, x_j\} \subset \mathbb{R}^n$ is affinely independent if and only if the set
$\{x_2 - x_1, x_3 - x_1, \ldots, x_j - x_1\}$ is linearly independent in $\mathbb{R}^n$. Recall also that the dimension of
$X \subset \mathbb{R}^n$ is defined as the size of the largest affinely-independent subset of X minus 1. Fact 1
is based on the well-known theorem of Carathéodory that any point in a compact convex set
$X \subset \mathbb{R}^n$ of dimension k can be written as a convex combination of k + 1 affinely-independent
extreme points of X.



2 The previous footnote applies here, too. 
Definition 1 (Formal quantum separability problem). Let $\rho \in \mathcal{D}_{M,N}$ be a mixed state.
Given the matrix$^3$ $[\rho]$ (with respect to the standard basis of $\mathbb{C}^M \otimes \mathbb{C}^N$) representing $\rho$, decide
whether $\rho$ is separable.

What is the significance of a separable state? For a pure state $|\psi\rangle = |\psi^A\rangle \otimes |\psi^B\rangle$, we
can imagine two spatially separated people (laboratories) - called "Alice" ("A") and "Bob"
("B") - who each have one part of $|\psi\rangle$: Alice has $|\psi^A\rangle$ and Bob has $|\psi^B\rangle$. We can further
imagine that Alice and Bob each prepared their respective part of the state, i.e. Alice
prepared a pure state $|\psi^A\rangle$ and Bob prepared a pure state $|\psi^B\rangle$, and $|\psi\rangle$ describes the state
of the union of Alice's system and Bob's system.

In preparing their systems, Alice and Bob could use classical randomness. Thus, instead
of preparing the pure state $|\psi^A\rangle$ with probability 1, Alice prepares the state $|\psi_i^A\rangle$ with
probability $p_i^A$. By imagining infinitely many repeated trials of this whole scenario, this
means Alice prepares the mixed state $\rho^A = \sum_i p_i^A |\psi_i^A\rangle\langle\psi_i^A|$. Similarly, Bob could prepare his
subsystem in the mixed state $\rho^B$. The state of the total system is then represented by $\rho^A \otimes \rho^B$.
States of this form can thus be prepared with local (randomised) operations.

Now suppose that Alice and Bob can telephone each other. Then they could coordinate
their subsystem-preparations: when Alice (through her local randomness) decides (with
probability $p_i$) to prepare $|\psi_i^A\rangle$, she tells Bob to prepare $|\psi_i^B\rangle$. The state of the total system is
now represented by

$$\sigma = \sum_i p_i\, |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|, \qquad (1.2)$$

which may not have a representation of the form $\rho^A \otimes \rho^B$. States of the form (1.2) can thus
be prepared with local operations and classical communication (abbreviated "LOCC"). These
are the separable states. Instead of a telephone (two-way classical channel), it suffices that
Alice and Bob share a source of randomness in order to create a separable state.
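The distinction between a separable state prepared with shared randomness and a product state $\rho^A \otimes \rho^B$ can be illustrated with a small sketch. It assumes NumPy, and the particular two-qubit state (a shared fair coin deciding between $|0\rangle|0\rangle$ and $|1\rangle|1\rangle$) is an illustrative choice, not an example taken from the text.

```python
import numpy as np

def proj(ket):
    """Projector |v><v| onto the given (already normalised) ket."""
    v = np.asarray(ket, dtype=complex)
    return np.outer(v, v.conj())

def partial_traces(rho, M, N):
    """Return (tr_B(rho), tr_A(rho)) for rho acting on C^M (x) C^N."""
    r = rho.reshape(M, N, M, N)          # indices (a, b, a', b')
    rho_A = np.einsum('ajbj->ab', r)     # sum over b = b'
    rho_B = np.einsum('iaib->ab', r)     # sum over a = a'
    return rho_A, rho_B

ket0, ket1 = [1, 0], [0, 1]

# Shared coin: with probability 1/2 both prepare |0>, otherwise both prepare |1>.
sigma = 0.5 * np.kron(proj(ket0), proj(ket0)) + 0.5 * np.kron(proj(ket1), proj(ket1))

rho_A, rho_B = partial_traces(sigma, 2, 2)
is_product = bool(np.allclose(sigma, np.kron(rho_A, rho_B)))
```

Here `sigma` is separable by construction, yet `is_product` is false: the classical correlations introduced by the shared coin cannot be reproduced by independent local preparations.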

If Alice and Bob share an entangled state (perhaps Alice prepared the total system and 
then sent the B-subsystem to Bob), then they share something that they could not have made 
with LOCC. Perhaps unsurprisingly, it turns out that sharing certain types of entangled states 
(see Section 2.2.2) allows Alice and Bob to communicate in ways that they could not have
with just a telephone 0,0. 

3 We do not yet define how the entries of this matrix are encoded; at this point, we assume all
entries have some finite representation (e.g. "$\sqrt{2}$") and that the computations on this matrix can
be done exactly.
1.2 One-sided tests and restrictions

Shortly after the importance of the quantum separability problem was recognised in the
quantum information community, efforts were made to solve it reasonably efficiently. In
this vein, many one-sided tests have been discovered. A one-sided test (for separability) is
a computational procedure (with input $[\rho]$) whose output can only ever imply one of the
following (with certainty):

• p is entangled (in the case of a necessary test) 

• p is separable (in the case of a sufficient test). 

There have been many good articles (e.g. ODD which review the one-sided (necessary) 
tests. As this thesis is concerned with algorithms that are both necessary and sufficient tests 
for separability for all M and N - and whose computer-implementations have a hope of being 
useful in low dimensions - I only review in detail the one-sided tests which give rise to such 
algorithms (see Section [Qj) . But here is a list of popular conditions on p giving rise to efficient 
one-sided tests for finite-dimensional bipartite separability: 

Necessary conditions for $\rho$ to be separable

• PPT test jsjj: $\rho^{T_B} \ge 0$, where "$T_B$" denotes partial transposition

• Reduction criterion: $\rho^A \otimes I - \rho \ge 0$ and $I \otimes \rho^B - \rho \ge 0$, where $\rho^A := \operatorname{tr}_B(\rho)$ and
"$\operatorname{tr}_B$" denotes partial trace (and similarly for $\rho^B$)

• Entropic criterion for $\alpha = 2$ and in the limit $\alpha \to 1$ [3|: $S_\alpha(\rho) \ge \max\{S_\alpha(\rho^A), S_\alpha(\rho^B)\}$;
where, for $\alpha > 1$, $S_\alpha(\rho) := \frac{1}{1-\alpha}\ln(\operatorname{tr}(\rho^\alpha))$

• Majorisation criterion: $\lambda^{\downarrow}_{\rho} \prec \lambda^{\downarrow}_{\rho^A}$ and $\lambda^{\downarrow}_{\rho} \prec \lambda^{\downarrow}_{\rho^B}$, where $\lambda^{\downarrow}_{\tau}$ is the list of eigenvalues of $\tau$
in nonincreasing order (padded with zeros if necessary), and $x \prec y$ for two lists of size s if
and only if the sum of the first k elements of list x is less than or equal to that of list y for
$k = 1, 2, \ldots, s$; the majorisation condition implies $\max\{\operatorname{rank}(\rho^A), \operatorname{rank}(\rho^B)\} \le \operatorname{rank}(\rho)$.

• Computable cross-norm/reshuffling criterion 0, 0|: $\|\mathcal{U}(\rho)\|_1 \le 1$, where $\|X\|_1 :=
\operatorname{tr}(\sqrt{X^\dagger X})$ is the trace norm; and $\mathcal{U}(\rho)$, an $M^2 \times N^2$ matrix, is defined on product
states as $\mathcal{U}(A \otimes B) := v(A)v(B)^T$, where, relative to a fixed basis, $[v(A)] =
(\operatorname{col}_1([A])^T, \ldots, \operatorname{col}_M([A])^T)^T$ (and similarly for $v(B)$), where $\operatorname{col}_i([A])$ is the ith column
of matrix $[A]$; more generally Q|, any linear map $\mathcal{U}$ that does not increase the
trace norm of product states may be used.
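As an illustration of how the first two necessary conditions above translate into a computation, here is a minimal NumPy sketch. The function names and the numerical tolerance `tol` are assumptions made for this example; exact arithmetic, as assumed in the formal problem definition, is replaced by floating-point eigenvalue computations.

```python
import numpy as np

def partial_transpose_B(rho, M, N):
    """Transpose only the B indices of rho acting on C^M (x) C^N."""
    r = rho.reshape(M, N, M, N)                       # indices (a, b, a', b')
    return r.transpose(0, 3, 2, 1).reshape(M * N, M * N)

def passes_ppt(rho, M, N, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(partial_transpose_B(rho, M, N)).min() >= -tol)

def passes_reduction(rho, M, N, tol=1e-12):
    """True if rho_A (x) I - rho >= 0 and I (x) rho_B - rho >= 0."""
    r = rho.reshape(M, N, M, N)
    rho_A = np.einsum('ajbj->ab', r)
    rho_B = np.einsum('iaib->ab', r)
    first = np.kron(rho_A, np.eye(N)) - rho
    second = np.kron(np.eye(M), rho_B) - rho
    smallest = min(np.linalg.eigvalsh(first).min(), np.linalg.eigvalsh(second).min())
    return bool(smallest >= -tol)

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_entangled = np.outer(bell, bell.conj())   # fails both tests
rho_mixed = np.eye(4, dtype=complex) / 4      # maximally mixed, passes both
```

A state that fails either function is certainly entangled; a state that passes both is not thereby shown to be separable, which is what makes these tests one-sided.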

Sufficient conditions for $\rho$ to be separable

• Distance from maximally mixed state (see also jl^|):

  - Q: e.g. $\operatorname{tr}\big((\rho - I_{M,N})^2\big) \le 1/(MN(MN - 1))$

  - $\lambda_{\min}(\rho) \ge (2 + MN)^{-1}$, where $\lambda_{\min}(\rho)$ denotes the smallest eigenvalue of $\rho$

• When M = 2 0: $\rho = \rho^{T_A}$.
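The two "distance from the maximally mixed state" conditions are simple to evaluate. The sketch below assumes NumPy; the function names and the example family of states (a maximally entangled two-qubit state mixed with $I/4$, with a hypothetical weight `p`) are illustrative choices.

```python
import numpy as np

def in_separable_ball(rho, M, N):
    """Check tr((rho - I/MN)^2) <= 1/(MN(MN-1))."""
    d = M * N
    diff = rho - np.eye(d) / d
    return bool(np.real(np.trace(diff @ diff)) <= 1.0 / (d * (d - 1)))

def min_eigenvalue_test(rho, M, N):
    """Check lambda_min(rho) >= 1/(2 + MN)."""
    return bool(np.linalg.eigvalsh(rho).min() >= 1.0 / (2 + M * N))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell_proj = np.outer(bell, bell.conj())

def noisy_bell(p):
    """Mixture p * |bell><bell| + (1 - p) * I/4."""
    return p * bell_proj + (1 - p) * np.eye(4) / 4

weakly_mixed = noisy_bell(0.2)   # close to I/4: both conditions hold
pure_bell = noisy_bell(1.0)      # far from I/4: neither condition holds
```

Passing either function certifies separability; failing both says nothing, since these are sufficient tests only.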

When $\rho$ is of a particular form, the PPT test is necessary and sufficient for separability.
This happens when

• $MN \le 6$ |2fj; or

• $\operatorname{rank}(\rho) \le N$ 0, , see also [22].

The criteria not based on eigenvalues are obviously efficiently computed i.e. computing the 
natural logarithm can be done with a truncated Taylor series, and the rank can be computed 
by Gaussian elimination. That the tests based on the remaining criteria are efficiently com- 
putable follows from the efficiency of algorithms for calculating the spectrum of a Hermitian 
operator.$^4$ The method of choice for computing the entire spectrum is the QR algorithm (see
any of |2J, |2£|), which has been shown to have good convergence properties |2J.



In a series of articles ([ 3 , 0], 2l|), various conditions for separability were obtained
which involve product vectors in the ranges of $\rho$ and $\rho^{T_A}$. Any constructive separability checks
given therein involve computing these product vectors, but no general bounds were obtained
by the authors on the complexity of such computations.



1.3 One-sided tests based on semidefinite programming 

Let $\mathbb{H}_{M,N}$ denote the set of all Hermitian operators mapping $\mathbb{C}^M \otimes \mathbb{C}^N$ to $\mathbb{C}^M \otimes \mathbb{C}^N$;
thus, $\mathcal{D}_{M,N} \subset \mathbb{H}_{M,N}$. This vector space is endowed with the Hilbert-Schmidt inner product
$\langle X, Y \rangle = \operatorname{tr}(XY)$, which induces the corresponding norm $\|X\| = \sqrt{\operatorname{tr}(X^2)}$ and distance
measure $\|X - Y\|$. By fixing an orthogonal Hermitian basis for $\mathbb{H}_{M,N}$, the elements of $\mathbb{H}_{M,N}$
are in one-to-one correspondence with the elements of the real Euclidean space $\mathbb{R}^{M^2N^2}$. If the
Hermitian basis is orthonormal, then the Hilbert-Schmidt inner product in $\mathbb{H}_{M,N}$ corresponds
exactly to the Euclidean dot product in $\mathbb{R}^{M^2N^2}$.

Thus $\mathcal{D}_{M,N}$ and $\mathcal{S}_{M,N}$ may be viewed as subsets of the Euclidean space $\mathbb{R}^{M^2N^2}$; actually,
because all density operators have unit trace, $\mathcal{D}_{M,N}$ and $\mathcal{S}_{M,N}$ are full-dimensional subsets
of $\mathbb{R}^{M^2N^2-1}$. This observation aids in solving the quantum separability problem, allowing us
to easily apply well-studied mathematical-programming tools. Below, I follow the popular
review article of semidefinite programming in [28].
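The correspondence between Hermitian operators and real vectors can be made concrete for two qubits (M = N = 2). The sketch below assumes NumPy and uses tensor products of Pauli matrices, scaled by 1/2, as one possible orthonormal Hermitian basis; the function names are illustrative.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def product_basis():
    """16 Hermitian operators on C^2 (x) C^2, orthonormal with respect to tr(XY)."""
    return [np.kron(a, b) / 2.0 for a, b in product([I2, SX, SY, SZ], repeat=2)]

def coordinates(X, basis):
    """Real coordinate vector of the Hermitian operator X in the given basis."""
    return np.array([np.real(np.trace(B @ X)) for B in basis])

def hs_inner(X, Y):
    """Hilbert-Schmidt inner product tr(XY) of two Hermitian operators."""
    return float(np.real(np.trace(X @ Y)))

basis = product_basis()
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
X = np.outer(bell, bell.conj())                    # an entangled pure state
Y = np.diag([0.5, 0.0, 0.0, 0.5]).astype(complex)  # a separable mixed state
x, y = coordinates(X, basis), coordinates(Y, basis)
```

The dot product `x @ y` agrees with `hs_inner(X, Y)`, and the first coordinate of every density operator is the same constant (here 1/2), which is why the states effectively live in a space of one dimension fewer.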

4 Note that $\rho^{T_B}$ and $\rho^A$ are Hermitian.
Definition 2 (Semidefinite program (SDP)). Given the vector $c \in \mathbb{R}^m$ and Hermitian
matrices $F_i \in \mathbb{C}^{n \times n}$, $i = 0, 1, \ldots, m$,

minimise $c^T x$ (1.3)
subject to: $F(x) \ge 0$, (1.4)

where $F(x) := F_0 + \sum_{i=1}^{m} x_i F_i$.

Call x (primal) feasible when $F(x) \ge 0$. When c = 0, the SDP reduces to the semidefinite
feasibility problem, which is to find an x such that $F(x) \ge 0$ or assert that no such x exists.
Semidefinite programs can be solved efficiently, in time $O(m^2 n^{2.5})$. Most algorithms are
iterative. Each iteration can be performed in time $O(m^2 n^2)$. The number of required iterations
has an analytical bound of $O(\sqrt{n})$, but in practice is more like $O(\log(n))$ or constant.
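To make the notation of Definition 2 concrete, the following sketch evaluates $F(x)$ and checks primal feasibility of a given candidate x by computing the smallest eigenvalue. It assumes NumPy; it does not solve an SDP, and the 2 x 2 instance is a made-up toy example.

```python
import numpy as np

def F(x, F0, Fs):
    """Evaluate F(x) = F0 + sum_i x_i F_i for Hermitian matrices F0, F_1, ..., F_m."""
    return F0 + sum(xi * Fi for xi, Fi in zip(x, Fs))

def is_primal_feasible(x, F0, Fs, tol=1e-12):
    """True if F(x) is positive semidefinite up to the tolerance tol."""
    return bool(np.linalg.eigvalsh(F(x, F0, Fs)).min() >= -tol)

# Toy instance with m = 1, n = 2: F(x) = [[1, x1], [x1, 1]] is PSD iff |x1| <= 1.
F0 = np.eye(2)
F1 = np.array([[0.0, 1.0], [1.0, 0.0]])

feasible_inside = is_primal_feasible([0.5], F0, [F1])
feasible_outside = is_primal_feasible([2.0], F0, [F1])
```

An actual SDP solver searches over x; this check only verifies a proposed point, which is the certificate side of the feasibility problem.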

Let $\mathbb{H}_M$ ($\mathbb{H}_N$) denote the set of all Hermitian operators mapping $\mathbb{C}^M$ to $\mathbb{C}^M$ ($\mathbb{C}^N$ to $\mathbb{C}^N$).
The real variables of the following SDPs will be the real coefficients of some quantum state
with respect to a fixed Hermitian basis of $\mathbb{H}_{M,N}$. The basis will be separable, that is, made
from bases of $\mathbb{H}_M$ and $\mathbb{H}_N$. It is usual to take the generators of SU(M) (the generalised Pauli
matrices) as a basis for $\mathbb{H}_M$ (see e.g. j^).



1.3.1 A test based on symmetric extensions 

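Consider a separable state $\sigma = \sum_i p_i\, |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|$, and consider the following
symmetric extension of $\sigma$ to k copies of subsystem A ($k \ge 2$):

$$\sigma_k = \sum_i p_i\, \big(|\psi_i^A\rangle\langle\psi_i^A|\big)^{\otimes k} \otimes |\psi_i^B\rangle\langle\psi_i^B|. \qquad (1.5)$$

The state $\sigma_k$ is so called because it satisfies two properties: (i) it is symmetric (unchanged)
under permutations (swaps) of any two copies of subsystem A; and (ii) it is an extension of
$\sigma$ in that tracing out any of its (k - 1) copies of subsystem A gives back $\sigma$. For an arbitrary
density operator $\rho \in \mathcal{D}(\mathbb{C}^M \otimes \mathbb{C}^N)$, define a symmetric extension of $\rho$ to k copies of subsystem
A ($\mathbb{C}^M$) as any density operator $\rho' \in \mathcal{D}((\mathbb{C}^M)^{\otimes k} \otimes \mathbb{C}^N)$ that satisfies (i) and (ii) with $\rho$ in place
of $\sigma$. It follows that if an arbitrary state $\rho$ does not have a symmetric extension to $k_0$ copies
of subsystem A for some $k_0$, then $\rho \notin \mathcal{S}_{M,N}$ (else we could construct $\rho_{k_0}$ as in (1.5)). Thus a method for
searching for symmetric extensions of $\rho$ to k copies of subsystem A gives a necessary test for
separability, in the sense of Section 1.2: failing to find an extension certifies entanglement.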
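Properties (i) and (ii) can be verified numerically for a small separable state. The sketch below assumes NumPy, takes M = N = 2 and k = 2, and uses an arbitrarily chosen (hypothetical) ensemble of product states; it constructs the extension directly from the known decomposition rather than searching for it.

```python
import numpy as np

def proj(ket):
    """Projector onto the normalised version of the given ket."""
    v = np.asarray(ket, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def symmetric_extension(probs, kets_A, kets_B, k):
    """sum_i p_i (|a_i><a_i|)^{(x) k} (x) |b_i><b_i|; k = 1 gives the state itself."""
    terms = []
    for p, a, b in zip(probs, kets_A, kets_B):
        op = proj(a)
        for _ in range(k - 1):
            op = np.kron(op, proj(a))
        terms.append(p * np.kron(op, proj(b)))
    return sum(terms)

probs = [0.3, 0.7]
kets_A = [[1, 0], [1, 1]]
kets_B = [[0, 1], [1, -1]]

sigma = symmetric_extension(probs, kets_A, kets_B, 1)     # state on A (x) B
sigma_2 = symmetric_extension(probs, kets_A, kets_B, 2)   # state on A1 (x) A2 (x) B

t = sigma_2.reshape(2, 2, 2, 2, 2, 2)                     # indices (a1, a2, b, a1', a2', b')
swapped = t.transpose(1, 0, 2, 4, 3, 5).reshape(8, 8)     # swap the two copies of A
trace_out_A1 = np.einsum('iabicd->abcd', t).reshape(4, 4)
trace_out_A2 = np.einsum('aibcid->abcd', t).reshape(4, 4)
```

Here `swapped` equals `sigma_2` (property (i)) and both partial traces equal `sigma` (property (ii)). For an unknown state no decomposition is available, which is why the search is phrased as an SDP.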

Doherty et al. (joi 0] showed that the search for a symmetric extension of $\rho$ to k copies of
subsystem A (for any fixed k) can be phrased as a SDP. This result, combined with the "quantum de
Finetti theorem" 0, 0] that $\rho \in \mathcal{S}_{M,N}$ if and only if, for all k, $\rho$ has a symmetric extension
to k copies of subsystem A, gives an infinite hierarchy (indexed by $k = 2, 3, \ldots$) of SDPs
with the property that, for each entangled state $\rho$, there exists a SDP in the hierarchy whose
solution will imply that $\rho$ is entangled.

Actually, Doherty et al. develop a stronger test, inspired by Peres' PPT test. The state $\sigma_k$,
which is positive semidefinite, satisfies a third property: (iii) it remains positive semidefinite
under all possible partial transpositions. Thus $\sigma_k$ is more precisely called a PPT symmetric
extension. The SDP can be easily modified to perform a search for PPT symmetric extensions
without any significant increase in computational complexity (one just needs to add
constraints that force the partial transpositions to be positive semidefinite). This strengthens
the separability test, because a given (entangled) state $\rho$ may have a symmetric extension to
$k_0$ copies of subsystem A but may not have a PPT symmetric extension to $k_0$ copies of
subsystem A (Doherty et al. also show that the (k + 1)st test in this stronger hierarchy subsumes
the kth test).

The final SDP has the following form:

minimise 0
subject to: $X_k \ge 0$ (1.6)
$(X_k)^{T_j} \ge 0$, $j \in J$,

where $X_k$ is a parametrisation of a symmetric extension of $\rho$ to k copies of subsystem A,
and J is the set of all subsets of the (k + 1) subsystems that give rise to inequivalent partial
transposes $(X_k)^{T_j}$ of $X_k$. By exploiting the symmetry property, the number of variables
of the SDP is $m = (d_{S_k}^2 - M^2)N^2$, where $d_{S_k} = \binom{M+k-1}{k}$ is the dimension of the
symmetric subspace of $(\mathbb{C}^M)^{\otimes k}$. The size of the matrix $X_k$ for the first constraint is $d_{S_k}^2 N^2$.
The number of inequivalent partial transpositions is $|J| = k$.$^5$ The constraint corresponding
to the transposition of $l$ copies of A, $l = 1, 2, \ldots, k - 1$, has a matrix of size $d_{S_l}^2 d_{S_{k-l}}^2 N^2$ |3lJ.
I will estimate the total complexity of this approach to the quantum separability problem in 
Section HS21 
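The size formulas quoted above are easy to tabulate for small cases. The following sketch uses only the Python standard library; the function and dictionary key names are illustrative, and the numbers are simply the stated formulas evaluated, not measured solver costs.

```python
from math import comb

def sym_dim(M, k):
    """Dimension of the symmetric subspace of (C^M)^{(x) k}."""
    return comb(M + k - 1, k)

def sdp_sizes(M, N, k):
    """Evaluate the size formulas for the k-copy PPT symmetric extension SDP."""
    d = sym_dim(M, k)
    return {
        "variables": (d * d - M * M) * N * N,
        "first_constraint_size": d * d * N * N,
        "partial_transposes": k,
        # Size of the constraint for transposing l copies of A, l = 1, ..., k-1.
        "transpose_sizes": {
            l: sym_dim(M, l) ** 2 * sym_dim(M, k - l) ** 2 * N * N
            for l in range(1, k)
        },
    }

two_qubits_k2 = sdp_sizes(2, 2, 2)
two_qubits_k3 = sdp_sizes(2, 2, 3)
```

For two qubits and k = 2 this gives 20 variables; the growth with k is polynomial for fixed M because the symmetric subspace, rather than the full k-fold tensor product, is used.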



1.3.2 A test based on semidefinite relaxations 

Doherty et al. formulate a hierarchy of necessary criteria for separability in terms of 
semidefinite programming - each separability criterion in the hierarchy may be checked by a 
SDP. As it stands, their approach is manifestly a one-sided test for separability, in that at 
no point in the hierarchy can one conclude that the given [p] corresponds to a separable state 

5 Choices are: transpose subsystem B, transpose 1 copy of subsystem A, transpose 2 copies of
subsystem A, ..., transpose $k - 1$ copies of subsystem A. Transposing all k copies of subsystem A is
equivalent to transposing subsystem B. Transposing with respect to both subsystem B and $l$ copies
of subsystem A is equivalent to transposing with respect to $k - l$ copies of subsystem A.
(happily, recent results show that this is, practically, not the case; see Section 3.3.2).

Soon after, Eisert et al. Q] had the idea of formulating a necessary and sufficient criterion
for separability as a hierarchy of SDPs. Define the function

$$E_{d^2}(\rho) := \min_{x \in \mathcal{S}_{M,N}} \operatorname{tr}((\rho - x)^2) \qquad (1.7)$$

for $\rho \in \mathcal{D}_{M,N}$. As $\operatorname{tr}((\rho - x)^2)$ is the square of the Euclidean distance from $\rho$ to x, $\rho$ is separable
if and only if $E_{d^2}(\rho) = 0$. The problem of computing $E_{d^2}(\rho)$ (to check whether it is zero) is
already formulated as a constrained optimisation. The following observation helps to rewrite
these constraints as low-degree polynomials in the variables of the problem:$^6$

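Fact 2 (Q). Let $O$ be a Hermitian operator and let $\alpha \in \mathbb{R}$ satisfy $0 < \alpha \le 1$. If $\operatorname{tr}(O^2) = \alpha^2$
and $\operatorname{tr}(O^3) = \alpha^3$, then $\operatorname{tr}(O) = \alpha$ and $\operatorname{rank}(O) = 1$ (i.e. $O$ corresponds to an unnormalised
pure state).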
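Fact 2 can be sanity-checked on small examples. The sketch below assumes NumPy; the two operators are arbitrary illustrative choices, one satisfying the hypotheses and one (of rank two) that cannot satisfy both trace conditions for any single value of $\alpha$.

```python
import numpy as np

def trace_powers(O):
    """Return (tr O, tr O^2, tr O^3) of a Hermitian operator via its eigenvalues."""
    ev = np.linalg.eigvalsh(O)
    return float(ev.sum()), float((ev ** 2).sum()), float((ev ** 3).sum())

alpha = 0.4
v = np.array([1, 1j]) / np.sqrt(2)
rank_one = alpha * np.outer(v, v.conj())       # unnormalised pure state
t1, t2, t3 = trace_powers(rank_one)

rank_two = np.diag([0.3, 0.1]).astype(complex)  # same trace, but rank 2
m1, m2, m3 = trace_powers(rank_two)
candidate_alpha = np.sqrt(m2)                   # only value compatible with tr(O^2) = alpha^2
violates_cubic = not np.isclose(m3, candidate_alpha ** 3)
```

For `rank_one` the three traces are $\alpha$, $\alpha^2$ and $\alpha^3$. For `rank_two`, fixing $\alpha$ from the quadratic condition makes the cubic condition fail, as the fact predicts for any operator of rank greater than one.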

Combining Fact 2 with Fact 1, the problem is equivalent to

minimise $\operatorname{tr}\big((\rho - \sum_{i=1}^{M^2N^2} X_i)^2\big)$
subject to: $\operatorname{tr}\big(\sum_{i=1}^{M^2N^2} X_i\big) = 1$
$\operatorname{tr}\big((\operatorname{tr}_j(X_i))^2\big) = (\operatorname{tr}(X_i))^2$, (1.8)
for $i = 1, 2, \ldots, M^2N^2$ and $j \in \{A, B\}$
$\operatorname{tr}\big((\operatorname{tr}_j(X_i))^3\big) = (\operatorname{tr}(X_i))^3$,
for $i = 1, 2, \ldots, M^2N^2$ and $j \in \{A, B\}$,

where the new variables are Hermitian matrices $X_i$ for $i = 1, 2, \ldots, M^2N^2$. The constraints
do not require $X_i$ to be tensor products of unit-trace pure density operators, because the
positive coefficients (probabilities summing to 1) that would normally appear in the expression
$\sum_{i=1}^{M^2N^2} X_i$ are absorbed into the $X_i$, in order to have fewer variables (i.e. the $X_i$ are
constrained to be density operators corresponding to unnormalised pure product states). Once
an appropriate Hermitian basis is chosen for $\mathbb{H}_{M,N}$, the matrices $X_i$ can be parametrised by
the real coefficients with respect to the basis; these coefficients form the real variables of the
feasibility problem. The constraints in (1.8) are polynomials in these variables of degree less
than or equal to 3.$^7$

Polynomially-constrained optimisation problems can be approximated by, or relaxed to,
semidefinite programs, via a number of different approaches (see references in [34]).$^8$ Some
approaches even give an asymptotically complete hierarchy of SDPs, indexed on, say, $i = 1, 2, \ldots$.
The SDP at level $i + 1$ in the hierarchy gives a better approximation to the original
problem than the SDP at level $i$; but, as expected, the size of the SDPs grows with $i$ so
that better approximations are more costly to compute. The hierarchy is asymptotically
complete because, under certain conditions, the optimal values of the relaxations converge
to the optimal value of the original problem as $i \to \infty$. Of these approaches, the method of
Lasserre j^f is appealing because a computational package [sf| written in MATLAB is freely
available. Moreover, this package has built into it a method for recognising when the optimal
solution to the original problem has been found (see 3(| and references therein). Because
of this feature, the one-sided test becomes, in practice, a full algorithm for the quantum
separability problem. However, no analytical worst-case upper bounds on the running time
of the algorithm for arbitrary $\rho \in \mathcal{D}_{M,N}$ are available.

6 To see why Fact 2 holds, note that in $\mathbb{R}^n$ the surface $\{(x_1, \ldots, x_n) : \sum_{i=1}^{n} x_i^3 = \alpha^3\}$ intersects
the hypersphere $\{(x_1, \ldots, x_n) : \sum_{i=1}^{n} x_i^2 = \alpha^2\}$ only at the points $(\alpha, 0, \ldots, 0)$, $(0, \alpha, 0, \ldots, 0)$, ...,
$(0, \ldots, 0, \alpha, 0, \ldots, 0)$, ..., $(0, \ldots, 0, \alpha)$.

7 Alternatively, we could parametrise the pure states (composing $X_i$) in $\mathbb{C}^M$ and $\mathbb{C}^N$ by the real
and imaginary parts of rectangularly-represented complex coefficients with respect to the standard
bases of $\mathbb{C}^M$ and $\mathbb{C}^N$:

minimise 0
subject to: $\operatorname{tr}\big((\rho - \sum_{i=1}^{M^2N^2} |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|)^2\big) = 0$ (1.9)
$\operatorname{tr}\big(\sum_{i=1}^{M^2N^2} |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|\big) = 1$.

This parametrisation hard-wires the constraint that the $|\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|$ are (unnormalised)
pure product states, but increases the degree of the polynomials in the constraint to 4 (for the unit
trace constraint) and 8 (for the distance constraint).

8 For our purposes, the idea of a relaxation can be briefly described as follows. The given problem
is to solve $\min_{x \in \mathbb{R}^n}\{p(x) : g_k(x) \ge 0, k = 1, \ldots, m\}$, where $p(x), g_k(x) : \mathbb{R}^n \to \mathbb{R}$ are real-valued
polynomials in $\mathbb{R}[x_1, \ldots, x_n]$. By introducing new variables corresponding to products of the given
variables (the number of these new variables depends on the maximum degree of the polynomials
$p, g_k$), we can make the objective function linear in the new variables; for example, when n = 2
and the maximum degree is 3, if $p(x) = 3x_1 + 2x_1x_2 + 4x_1x_2^2$ then the objective function is $c^T y$
with $c = (0, 3, 0, 0, 2, 0, 0, 0, 4, 0) \in \mathbb{R}^{10}$ and $y \in \mathbb{R}^{10}$, where 10 is the total number of monomials in
$\mathbb{R}[x_1, x_2]$ of degree less than or equal to 3. Each polynomial defining the feasible set $G := \{x \in \mathbb{R}^n :
g_k(x) \ge 0, k = 1, \ldots, m\}$ can be viewed similarly. A relaxation of the original problem is a SDP with
objective function $c^T y$ and with a (convex) feasible region (in a higher-dimensional space) whose
projection onto the original space $\mathbb{R}^n$ approximates G. Better approximations to G can be obtained
by going to higher dimensions.

1.3.3 Entanglement Measures

The function $E_{d^2}(\rho)$ defined in Eqn. (1.7), but first defined in j^, is also known as an
entanglement measure, which, at the very least, is a nonnegative real function defined on
$\mathcal{D}_{M,N}$.$^9$ If an entanglement measure $E(\rho)$ satisfies

$$E(\rho) = 0 \iff \rho \in \mathcal{S}_{M,N}, \qquad (1.10)$$

then, in principle, any algorithm for computing $E(\rho)$ gives an algorithm for the quantum
separability problem. Note that most entanglement measures E do not satisfy (1.10): most
just satisfy $E(\rho) = 0 \Leftarrow \rho \in \mathcal{S}_{M,N}$.

9 For a comprehensive review of entanglement measures (and a whole lot more!), see [38].

A class of entanglement measures that do satisfy (1.10) are the so-called "distance
measures" $E_d(\rho) := \min_{\sigma \in \mathcal{S}_{M,N}} d(\rho, \sigma)$, for any reasonable measure of "distance" $d(x, y)$
satisfying $d(x, y) \ge 0$ and $(d(x, y) = 0) \iff (x = y)$. If d is the square of the Euclidean
distance, we get $E_{d^2}(\rho)$. Another "distance measure" is the von Neumann relative entropy
$S(x, y) := \operatorname{tr}(x(\log x - \log y))$.

In Eisert et al.'s approach, we could replace $E_{d^2}$ by $E_d$ for any "distance function" $d(\rho, \sigma)$
that is expressible as a polynomial in the variables of $\sigma$. What dominates the running time of
Eisert et al.'s approach is the implicit minimisation over $\mathcal{S}_{M,N}$, so using a different "distance
measure" (i.e. only changing the first constraint in (1.8)) like $(\operatorname{tr}(\rho - \sigma))^2$ would not improve
the analytic runtime (because the degree of the polynomial in the constraint is still 2), but
may help in practice.
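The linearisation step behind such relaxations (described in footnote 8) replaces a polynomial objective by a linear function of monomial variables. The sketch below, using only the Python standard library, builds the coefficient vector c for the footnote's example polynomial; the graded ordering of monomials and the dictionary encoding of a polynomial are assumptions chosen for this illustration.

```python
from itertools import product

def monomials(n, max_degree):
    """Exponent tuples of all monomials in n variables with total degree <= max_degree,
    ordered by total degree and, within a degree, with higher powers of x1 first."""
    exps = [e for e in product(range(max_degree + 1), repeat=n) if sum(e) <= max_degree]
    return sorted(exps, key=lambda e: (sum(e), tuple(-x for x in e)))

def coefficient_vector(poly, n, max_degree):
    """Coefficients of poly (a dict {exponent tuple: coefficient}) in monomial order."""
    return [poly.get(e, 0) for e in monomials(n, max_degree)]

# p(x) = 3*x1 + 2*x1*x2 + 4*x1*x2^2, with n = 2 and maximum degree 3.
p = {(1, 0): 3, (1, 1): 2, (1, 2): 4}
basis = monomials(2, 3)            # 1, x1, x2, x1^2, x1*x2, x2^2, x1^3, x1^2*x2, x1*x2^2, x2^3
c = coefficient_vector(p, 2, 3)    # [0, 3, 0, 0, 2, 0, 0, 0, 4, 0]
```

The objective $p(x)$ then reads $c^T y$, where y collects one new variable per monomial; the relaxation constrains y by semidefinite conditions instead of requiring it to come from an actual point x.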

Another entanglement measure E that satisfies (1.10) is the entanglement of formation

$$E_F(\rho) := \min_{\{p_i, |\psi_i\rangle\}_i :\ \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|} \ \sum_i p_i\, S\big(\operatorname{tr}_B(|\psi_i\rangle\langle\psi_i|)\big), \qquad (1.11)$$

where $S(\rho) := -\operatorname{tr}(\rho \log(\rho))$ is the von Neumann entropy. This gives another strategy for a
separability algorithm: search through all decompositions of the given $\rho$ to find one that is
separable. We can implement this strategy using the same relaxation technique of Eisert et
al., but first we have to formulate the strategy as a polynomially-constrained optimisation
problem. The role of the function S is to measure the entanglement of $|\psi_i\rangle\langle\psi_i|$ by measuring
the mixedness of the reduced state $\operatorname{tr}_B(|\psi_i\rangle\langle\psi_i|)$. For our purposes, we can replace S with any
other function T that measures mixedness such that, for all $\rho \in \mathcal{D}_{M,N}$, $T(\rho) \ge 0$ and $T(\rho) = 0$
if and only if $\rho$ is pure. Recalling that, for any $\rho \in \mathcal{D}_{M,N}$, $\operatorname{tr}(\rho^2) \le 1$ with equality if and only
if $\rho$ is pure, the following function $T(\rho) := 1 - \operatorname{tr}(\rho^2)$ suffices; this function T may be written
as a (finite-degree) polynomial in the real variables of $\rho$, whereas S could not. Defining

$$E'_F(\rho) := \min_{\{p_i, |\psi_i\rangle\}_i :\ \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|} \ \sum_i p_i\, T\big(\operatorname{tr}_B(|\psi_i\rangle\langle\psi_i|)\big), \qquad (1.12)$$

we have that $E'_F$ satisfies (1.10). Using an argument similar to the proof of Lemma 1 in
j^ij, we can show that the minimum in (1.12) is attained by a finite decomposition of $\rho$ into
$M^2N^2 + 1$ pure states. Thus, the following polynomially-constrained optimisation problem
can be approximated by semidefinite relaxations:

minimise $\sum_{i=1}^{M^2N^2+1} \operatorname{tr}(X_i)\, T(\operatorname{tr}_B(X_i))$
subject to: $\operatorname{tr}\big((\sum_{i=1}^{M^2N^2+1} X_i - [\rho])^2\big) = 0$
$\operatorname{tr}\big(\sum_{i=1}^{M^2N^2+1} X_i\big) = 1$
$\operatorname{tr}(X_i^2) = (\operatorname{tr}(X_i))^2$, (1.13)
for $i = 1, 2, \ldots, M^2N^2 + 1$
$\operatorname{tr}(X_i^3) = (\operatorname{tr}(X_i))^3$,
for $i = 1, 2, \ldots, M^2N^2 + 1$.

The above has about half as many constraints as (1.8), so it would be interesting to compare
the performance of the two approaches.

1.3.4 Other tests 

There are several one-sided tests which do not lead to full algorithms for the quantum
separability problem for $\mathcal{S}_{M,N}$. Brandão and Vianna 4l| have a set of one-sided necessary
tests based on deterministic relaxations of a robust semidefinite program, but this set is not an
asymptotically complete hierarchy. The same authors also have a related randomised quantum
separability algorithm which uses probabilistic relaxations of the same robust semidefinite
program [42]. Randomised algorithms for the quantum separability problem are outside the
scope of this thesis.

Woerdeman has a set of one-sided tests for the case where M = 2. His approach might
be described as the mirror-image of Doherty et al.'s: Instead of using an infinite hierarchy 
of necessary criteria for separability, he uses an infinite hierarchy of sufficient criteria. Each 
criterion in the hierarchy can be checked with a SDP. 



Chapter 2 
Convexity 



The set of bipartite separable quantum states $\mathcal{S}_{M,N}$ in $\mathbb{H}_{M,N}$ is defined as the closed convex
hull of the separable pure states:

$$\mathcal{S}_{M,N} := \operatorname{conv}\{|\psi^A\rangle\langle\psi^A| \otimes |\psi^B\rangle\langle\psi^B| \in \mathbb{H}_{M,N}\}. \qquad (2.1)$$

$\mathcal{S}_{M,N}$ is also compact (see e.g. 0). Since the separable states form a convex and compact subset
of $\mathbb{R}^{M^2N^2}$, a plethora of well-studied mathematical and computational tools are available
for the separability problem, as we shall see.

First, I apply polyhedral theory to show that $\mathcal{S}_{M,N}$ is not a polytope, easily settling an
open problem. I then review the concept of an entanglement witness and define a new class of 
entanglement witnesses which have some advantage over conventional entanglement witnesses 
in the detection of entanglement. I finish the chapter with a review of the five basic convex 
body problems and their relation to the separability problem. 



2.1 Polyhedra and $\mathcal{S}_{M,N}$

The following definitions may be found in [?] (but I use operator notation in keeping
with the spirit of quantum physics). If $A \in \mathbb{H}_{M,N}$ and $A \neq 0$ and $a \in \mathbb{R}$, then
$\{x \in \mathbb{H}_{M,N} : \mathrm{tr}(Ax) \le a\}$ is called the halfspace $H_{A,a}$. The boundary
$\{x \in \mathbb{H}_{M,N} : \mathrm{tr}(Ax) = a\}$ of $H_{A,a}$ is the hyperplane $\pi_{A,a}$ with normal $A$. Call two
hyperplanes parallel if they share the same normal. Let $H^\circ_{A,a}$ denote the interior
$H_{A,a} \setminus \pi_{A,a}$ of $H_{A,a}$. Note that $H^\circ_{-A,-a}$ is just the complement of $H_{A,a}$. The density
operators of an $M$ by $N$ quantum system lie on the hyperplane $\pi_{I,1}$:
$\mathcal{D}_{M,N} = \{\rho \in \mathbb{H}_{M,N} : \rho \ge 0\} \cap \pi_{I,1} \subset \mathbb{R}^{M^2N^2-1}$.

The intersection of finitely many halfspaces is called a polyhedron. Every polyhedron is a 
convex set. Let $D$ be a polyhedron. A set $F \subseteq D$ is a face of $D$ if there exists a halfspace
$H_{A,a}$ containing $D$ such that $F = D \cap \pi_{A,a}$. If $v$ is a point in $D$ such that the set $\{v\}$ is
a face of $D$, then $v$ is called a vertex of $D$. A facet of $D$ is a nonempty face of $D$ having
dimension one less than the dimension of $D$. A polyhedron that is contained in a hypersphere
$\{x \in \mathbb{H}_{M,N} : \mathrm{tr}(x^2) = r^2\}$ of finite radius $r$ is called a polytope.

What is the shape of $\mathcal{S}_{M,N}$ in $\mathbb{R}^{M^2N^2-1}$ (with respect to the Euclidean norm)? Is it a
polytope? This is an interesting question which arises when considering separability in an 
experimental setting and comparing it to nonlocality (Section | 



Minkowski's theorem [44] says that every polytope in $\mathbb{R}^n$ is the convex hull of its finitely
many vertices (extreme points). Recall that an extreme point of a convex set is one that
cannot be written as a nontrivial convex combination of other elements of the set. To show
that $\mathcal{S}_{M,N}$ is not a polytope, it suffices to show that it has infinitely many extreme points.
The extreme points of $\mathcal{S}_{M,N}$ are precisely the product states, as we now show (see also [?]).
A mixed state is not extreme, by definition. Conversely, we have that

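$$|\psi\rangle\langle\psi| = \sum_i p_i |\psi_i\rangle\langle\psi_i| \tag{2.2}$$

implies

$$1 = \sum_i p_i \langle\psi|\psi_i\rangle\langle\psi_i|\psi\rangle = \sum_i p_i |\langle\psi_i|\psi\rangle|^2, \tag{2.3}$$

which, because the $p_i$ are positive and sum to 1 and each $|\langle\psi_i|\psi\rangle| \le 1$, implies that
$|\langle\psi_i|\psi\rangle| = 1$ for all $i$; thus, a pure state is extreme. Since $\mathcal{S}_{M,N}$ has
infinitely many pure product states, we have the following fact, which settles an open problem
posed in [?].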



Fact 3. $\mathcal{S}_{M,N}$ is not a polytope in $\mathbb{R}^{M^2N^2-1}$.



2.2 Entanglement witnesses 

The compactness of $\mathcal{S}_{M,N}$ and the fact that any point not in a convex set in $\mathbb{R}^n$ can be
separated from the set by a hyperplane imply that for each entangled state $\rho$ there exists
a halfspace $H_{A,a}$ whose interior $H^\circ_{A,a}$ contains $\rho$ but contains no member of $\mathcal{S}_{M,N}$ [?]. Call
$A \in \mathbb{H}_{M,N}$ an entanglement witness [?] if for some $a \in \mathbb{R}$

$$\mathcal{S}_{M,N} \cap H^\circ_{A,a} = \emptyset \quad\text{and}\quad \mathcal{E}_{M,N} \cap H^\circ_{A,a} \neq \emptyset. \tag{2.4}$$

Entanglement witnesses $A$ with $a = 0$ in (2.4) correspond to the conventional definition of
"entanglement witness" found in the literature, e.g. [?].
2.2.1 Experimental separability

Suppose that a physical property A of a state p may be measured or observed. The result 
of such a measurement is a real number (in practice having finite representation dictated by 
the precision of the measurement apparatus). An axiom of quantum mechanics is that all 
possible real outcomes of measuring property A form the spectrum of a Hermitian operator 
(which we also denote by "A"). We assume that in principle all such physical properties A 
are in one-to-one correspondence with the Hermitian operators acting on the Hilbert space, 
so that any Hermitian operator defines a physical property that can be measured. When 
property A of p is measured in the laboratory, the measurement axiom dictates that the 
expected value of the measurement is 

$$\langle A \rangle_\rho := \mathrm{tr}(A\rho).$$

Such physical properties or Hermitian operators, A, are also called observables. 
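The measurement axiom above can be sketched numerically. The following is a minimal illustration, assuming NumPy; the function names `expectation` and `estimate_expectation` and the example observable are hypothetical choices of mine. The second function simulates measuring the observable once on each of finitely many copies, using the Born rule on the eigenbasis of $A$, and averages the outcomes.

```python
import numpy as np

def expectation(A, rho):
    """Exact expected value <A>_rho = tr(A rho)."""
    return np.trace(A @ rho).real

def estimate_expectation(A, rho, copies, rng):
    """Simulate measuring A once on each of `copies` copies of rho, then average."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    # Born rule: the probability of outcome i is <e_i| rho |e_i>.
    probs = np.einsum('ji,jk,ki->i', eigenvectors.conj(), rho, eigenvectors).real
    probs = np.clip(probs, 0.0, None)
    probs /= probs.sum()
    outcomes = rng.choice(eigenvalues, size=copies, p=probs)
    return outcomes.mean()

# Hypothetical example: A = Z (x) Z on two qubits and rho = |00><00|.
Z = np.diag([1.0, -1.0])
A = np.kron(Z, Z)
rho = np.zeros((4, 4))
rho[0, 0] = 1.0

exact = expectation(A, rho)
estimate = estimate_expectation(A, rho, 1000, np.random.default_rng(0))
print(exact, estimate)
```

For a state that is not an eigenstate of $A$, the estimate fluctuates around the exact value with a spread that shrinks as the number of copies grows.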

Entanglement witnesses can be used to determine that a physical quantum state is entangled.
Suppose $A$ is an EW as in (2.4) and that a state $\rho$ that is produced in the lab is not
known to be separable. If sufficiently many copies of $\rho$ may be produced, then measuring the
observable $A$ (once) on each copy of $\rho$ gives a good estimate of $\langle A \rangle_\rho$ which, if less than $a$,
indicates that $\rho \in H^\circ_{A,a}$ and hence that $\rho$ is entangled. Otherwise, if $\langle A \rangle_\rho \ge a$, then $\rho$ may be
entangled or separable. The best value of $a$ to use in (2.4) is
$a^* = \min_{|\psi\rangle\langle\psi| \in \mathcal{S}_{M,N}} \{\langle\psi|A|\psi\rangle\}$
since, with this value of $a$, the hyperplane $\pi_{A,a}$ is tangent to $\mathcal{S}_{M,N}$ and thus the volume of
entangled states that can be detected by measuring observable $A$ is maximised. With this in
mind, define

$$a^*(A) := \min_{|\psi\rangle\langle\psi| \in \mathcal{S}_{M,N}} \{\langle\psi|A|\psi\rangle\}$$

if $A$ is an entanglement witness.
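To make the definition of $a^*(A)$ concrete, here is a hedged numerical sketch. It is not a method taken from this thesis: it uses a simple alternating ("see-saw") eigenvalue minimisation over product vectors $|a\rangle \otimes |b\rangle$, which only finds a local minimum and therefore only an upper bound on $a^*(A)$. The function name `seesaw_upper_bound` and its parameters are illustrative assumptions.

```python
import numpy as np

def seesaw_upper_bound(A, M, N, restarts=5, sweeps=50, seed=0):
    """Heuristic upper bound on a*(A) = min <a,b|A|a,b> over pure product states."""
    rng = np.random.default_rng(seed)
    T = A.reshape(M, N, M, N)  # T[i, j, k, l] = <i j| A |k l>
    best = np.inf
    for _ in range(restarts):
        b = rng.normal(size=N) + 1j * rng.normal(size=N)
        b /= np.linalg.norm(b)
        for _ in range(sweeps):
            # Fix |b> and minimise over |a>: lowest eigenvector of <b|A|b>.
            A_eff = np.einsum('j,ijkl,l->ik', b.conj(), T, b)
            a = np.linalg.eigh(A_eff)[1][:, 0]
            # Fix |a> and minimise over |b>: lowest eigenvector of <a|A|a>.
            B_eff = np.einsum('i,ijkl,k->jl', a.conj(), T, a)
            vals, vecs = np.linalg.eigh(B_eff)
            b = vecs[:, 0]
        # vals[0] equals <a,b|A|a,b> at the final pair (a, b).
        best = min(best, vals[0])
    return best

# Hypothetical example on two qubits: A = |00><11| + |11><00|.
A = np.zeros((4, 4))
A[0, 3] = A[3, 0] = 1.0
bound = seesaw_upper_bound(A, 2, 2)
print(bound)
```

For this small example the printed value should be close to $-1/2$. In general the heuristic can get stuck in a local minimum, so the returned number should be read as an upper bound only.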

Much work has been done on entanglement witnesses and their utility in investigating the
separability of quantum states, e.g. [?], [?]. Entanglement witnesses have been found to be
particularly useful for experimentally detecting the entanglement of states of the particular
form $p|\psi\rangle\langle\psi| + (1 - p)\sigma$, where $|\psi\rangle$ is an entangled state and $\sigma$ is a mixed state close to the
maximally mixed state and $0 < p < 1$ [46], [?].

2.2.2 Polytopes in separability and nonlocality 

Detection of the entanglement of reproducible physical states in the lab would be
straightforward if there were a relatively small number $K$ of entanglement witnesses $A_i$ such that
$\mathcal{E}_{M,N}$ is contained in

K 
i=l 

where $a_i := a^*(A_i)$. This would imply that $\mathcal{S}_{M,N}$ is

K 
i=l 

that is, that $\mathcal{S}_{M,N}$ is a polytope. Alas, it is not (see Section 2.1). But this raises an interesting
question: 

Problem 1. Given $k > M^2N^2$, find the $k$-facet polytope $\Pi$ containing $\mathcal{S}_{M,N}$ such that the
volume of $\Pi \setminus \mathcal{S}_{M,N}$ is minimal.

Polytope enthusiasts will be happy to know, however, that their favorite convex set plays 
a role in the confounding issue of nonlocality, which I now explain. We know that for any 
entangled state there is always an observable (entanglement witness) acting on the total 
system whose statistics will imply that the state of the system is entangled. We also noted 
earlier that entangled states could not be prepared by Alice and Bob with just LOCC. It 
turns out that the total statistics of some set of local observables on an entangled state can 
also imply that the state is entangled, by revealing the inconsistency with LOCC. 

Alice and Bob share the bipartite system and want to probe its properties by each per- 
forming some local tests independently of each other (for a statistical interpretation, we again 
assume that Alice and Bob will repeat this procedure with identically prepared systems in- 
finitely many times). After performing the tests, they will communicate their results to a 
common location to be analysed. They will want to see if the results of their tests violate an 
assumption that their subsystems are correlated in a way no stronger than what is allowed
by LOCC. Suppose Alice will choose one of $N_A$ tests (labelled by $A_i$) to perform, with the
$i$th test having one of $N_{A_i}$ mutually exclusive outcomes (labelled by $A_i(j)$). If Alice's
subsystem were totally independent of Bob's, then the outcomes of her tests may be thought
to be governed by a local variable $\lambda_A$ which - while possibly uncontrollable or inaccessible -
may indeed exist (local realism assumption); the possible values that $\lambda_A$ may assume are in
one-to-one correspondence with the possible states of Alice's subsystem. A particular setting
of $\lambda_A$ dictates which outcome each test will have. Thus, for a given set of tests, we can view
each $\lambda_A$ as a Boolean vector of length $\sum_i N_{A_i}$ that is the concatenation of $N_A$ Boolean vectors,
the $i$th of length $N_{A_i}$ and each having exactly 1 nonzero entry. For example, for $N_A = 2$ and
$N_{A_1} = 2$ and $N_{A_2} = 3$, a possible $\lambda_A$ is $\lambda_A = (0, 1; 0, 1, 0)$, which says that test $A_1$ will have
outcome $A_1(2)$ and test $A_2$ will have outcome $A_2(2)$. We assume a similar setup on Bob's
side. The total hidden variable is then $\lambda = (\lambda_A, \lambda_B)$ which dictates Alice's and Bob's results.
Now $B_\lambda := \lambda_A \otimes \lambda_B$ is the vector whose entries are probabilities of getting pairs of outcomes
(conditioned on performing the tests which can give rise to such outcomes).
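The construction of the local vectors can be sketched as follows. This is a minimal illustration in plain Python; the helper names (`one_hot`, `local_vectors`, `kron`) and the choice of two two-outcome tests on Bob's side are my own assumptions for the example.

```python
from itertools import product

def one_hot(n, k):
    """Boolean vector of length n with a single 1 at position k."""
    return [1 if i == k else 0 for i in range(n)]

def local_vectors(outcome_counts):
    """All deterministic assignments: exactly one outcome chosen for each test."""
    vectors = []
    for choice in product(*(range(n) for n in outcome_counts)):
        vec = []
        for n, k in zip(outcome_counts, choice):
            vec.extend(one_hot(n, k))
        vectors.append(tuple(vec))
    return vectors

def kron(u, v):
    """Tensor (Kronecker) product of two vectors."""
    return tuple(x * y for x in u for y in v)

alice = local_vectors([2, 3])  # two tests with 2 and 3 outcomes, as in the text
bob = local_vectors([2, 2])    # hypothetical setup on Bob's side
vertices = [kron(la, lb) for la in alice for lb in bob]

print(len(alice), len(bob), len(vertices))  # 6 4 24
print((0, 1, 0, 1, 0) in alice)             # True
```

Each entry of `vertices` plays the role of one vector $B_\lambda$; there are finitely many of them because each side has finitely many deterministic assignments.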

Suppose Alice and Bob carry out their experiment which consists of repeated trials, the
measurements in each trial done simultaneously$^1$ to prevent Alice's outcome from influencing
Bob's and vice versa. Let $P$ be the vector of measured (conditional) probabilities of pairs of
outcomes. Then the statistics are consistent with a LOCC state if and only if

$$P \in \mathrm{conv}(\{B_\lambda\}_\lambda), \tag{2.5}$$

where $\mathrm{conv}(\{B_\lambda\}_\lambda)$ is called the correlation polytope. Note that there is a different correlation
polytope for every different experimental setup.$^2$

A hyperplane which separates $P$ from the correlation polytope (corresponding to some
experimental setup) corresponds to a "violation of a generalised Bell inequality" [?],
which indicates that the state of the system is not separable. However, to show that a state is
consistent with a local hidden variables theory would require examining all possible correlation
polytopes and corresponding statistical vectors $P$, i.e. all possible experiments. Experiments
can also be done on pairs (or triples, etc.) of subsystems at a time, or Alice and Bob could
perform sequences of tests rather than just single tests. In the case of some "Werner states"
[?], this more general type of experimental setup gives rise to a violation of a Bell inequality,
where the simple setup above does not [57]. The strange thing about quantum mechanics is
that there may exist states whose statistics are consistent with LOCC but which cannot be
prepared with LOCC; entangled states which pass the PPT test are conjectured to be such
states.

2.2.3 Ambidextrous entanglement witnesses 

Suppose that $A$ is not an entanglement witness but that $-A$ is. In this case, an estimate
of $\mathrm{tr}(A\rho)$ is just as useful in testing whether $\rho$ is entangled. We extend the definition of
"entanglement witness" to reflect this fact: Call $A \in \mathbb{H}_{M,N}$ a left (entanglement) witness if
(2.4) holds for some $a \in \mathbb{R}$, and a right (entanglement) witness if

$$\mathcal{S}_{M,N} \cap H^\circ_{-A,-b} = \emptyset \quad\text{and}\quad \mathcal{E}_{M,N} \cap H^\circ_{-A,-b} \neq \emptyset \tag{2.6}$$

for some $b \in \mathbb{R}$. As well, for $A$ a right witness, define

$$b^*(A) := \max_{|\psi\rangle\langle\psi| \in \mathcal{S}_{M,N}} \{\langle\psi|A|\psi\rangle\}.$$

Note that $A$ is a left witness if and only if $-A$ is a right witness.

1 It follows from the postulates of the theory of relativity that physical influences cannot propagate
faster than light. More precisely, using the terminology of relativity, we want the measurements to
be done in a causally disconnected manner.

2 I have followed the formulation of Peres, which is tailored to the nonlocality problem.
Pitowsky's very general formulation [?] has application beyond the nonlocality problem; however,
it is well suited to tests with two outcomes (Boolean tests), as in photon detectors (which either
"click" or do not "click"), where it gives a polytope in lower dimension than Peres' construction, e.g.
compare the treatments of [?] in [?] and [?]. For tests with more than two outcomes, Pitowsky's
correlation polytope contains "local junk" - product-vectors (e.g. $(1, 1, \ldots, 1)$) which are not valid
statistical vectors $P$ (an artifact of the generality of the construction which allows for not necessarily
distinct events).

The operator $A \in \mathbb{H}_{M,N}$ defines the family $\{\pi_{A,a}\}_{a \in \mathbb{R}}$ of parallel hyperplanes in $\mathbb{R}^{M^2N^2}$.
Consider the hyperplane $\pi_A := \pi_{A,\,\mathrm{tr}(A)/MN}$ which cuts through $\mathcal{S}_{M,N}$ at the maximally mixed
state $I_{M,N}$. When can $\pi_A$ be shifted parallel to its normal so that it separates $\mathcal{S}_{M,N}$ from
some entangled states? If $A$ is both a left and right witness, then $\pi_A$ can be shifted either in
the positive or negative directions of the normal. In this case, the two parallel hyperplanes
$\pi_{A,a^*(A)}$ and $\pi_{A,b^*(A)}$ sandwich $\mathcal{S}_{M,N}$ with some entangled states outside of the sandwich, which
we will denote by $W(A) := H_{-A,-a^*(A)} \cap H_{A,b^*(A)}$.

Definition 3 (Ambidextrous entanglement witness). An operator $A \in \mathbb{H}_{M,N}$ is an
ambidextrous (entanglement) witness if it is both a left witness and a right witness.

If $A$ is an ambidextrous witness, then $\rho$ is entangled if $\langle A \rangle_\rho < a^*(A)$ or if $\langle A \rangle_\rho > b^*(A)$.
We can further define a left-handed witness to be an entanglement witness that is left but
not right. Say that two entangled states $\rho_1$ and $\rho_2$ are on opposite sides of $\mathcal{S}_{M,N}$ if there
does not exist a halfspace $H_{A,a}$ such that $H^\circ_{A,a}$ contains $\rho_1$ and $\rho_2$ but contains no separable
states. Ambidextrous witnesses have the potential advantage over conventional (left-handed)
entanglement witnesses that they can detect entangled states on opposite sides of $\mathcal{S}_{M,N}$ with
the same physical measurement.

Entanglement witnesses can be simply characterised by their spectral decomposition. In
the following, suppose $A \in \mathbb{H}_{M,N}$ has spectral decomposition
$A = \sum_{i=0}^{MN-1} \lambda_i |\lambda_i\rangle\langle\lambda_i|$ with
$\lambda_0 \le \lambda_1 \le \ldots \le \lambda_{MN-1}$.

Fact 4. The operator $A$ is a left witness if and only if there exists $k \in [0, 1, \ldots, MN - 2]$
such that $\mathrm{span}(\{|\lambda_0\rangle, |\lambda_1\rangle, \ldots, |\lambda_k\rangle\})$ contains no separable pure states and $\lambda_{k+1} > \lambda_k$.

Proof. Suppose first that there exists no such $k$. Then $|\lambda_0\rangle$ is, without loss of generality,
a separable pure state (because the eigenspace corresponding to $\lambda_0$ must contain a product
state), so $A$ cannot be a left witness. To prove the converse, suppose that such a $k$
does exist and that $\lambda_{k+1} > \lambda_k$. Define the real function $f(\sigma) := \mathrm{tr}(A\sigma)$ on $\mathcal{S}_{M,N}$. Since
$\mathrm{span}(\{|\lambda_0\rangle, |\lambda_1\rangle, \ldots, |\lambda_k\rangle\})$ contains no separable states and $\lambda_{k+1} > \lambda_k$, the function satisfies
$f(\sigma) > \lambda_0$. Since the set of separable states is compact, there exists a separable state $\sigma'$ that
minimises $f(\sigma)$. Thus, setting $a := f(\sigma')$ gives $\mathcal{S}_{M,N} \cap H^\circ_{A,a} = \emptyset$. As well, $\mathcal{E}_{M,N} \cap H^\circ_{A,a} \neq \emptyset$
since $\mathrm{tr}(A|\lambda_0\rangle\langle\lambda_0|) = \lambda_0 < a$, and so $A$ is a left witness. □
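The simplest instance of Fact 4 is $k = 0$: a nondegenerate lowest eigenvalue whose eigenvector is entangled. The sketch below checks only this special case, as a sufficient condition; deciding whether a higher-dimensional eigenspace contains a product state is harder and is not attempted here. The function name `passes_fact4_with_k0` and the example operators are illustrative assumptions.

```python
import numpy as np

def passes_fact4_with_k0(A, M, N, tol=1e-9):
    """True if the k = 0 condition of Fact 4 holds for the Hermitian matrix A.

    A False result only means this special case does not apply.
    """
    vals, vecs = np.linalg.eigh(A)
    if vals[1] - vals[0] <= tol:
        return False  # lowest eigenvalue is degenerate, so k = 0 is ruled out
    # |lambda_0> is a product vector iff its M x N coefficient matrix has rank 1.
    schmidt = np.linalg.svd(vecs[:, 0].reshape(M, N), compute_uv=False)
    return int(np.sum(schmidt > tol)) > 1

# Hypothetical two-qubit examples.
A_entangled = np.zeros((4, 4))
A_entangled[1, 2] = A_entangled[2, 1] = 1.0   # |01><10| + |10><01|
A_product = np.diag([0.0, 1.0, 2.0, 3.0])     # lowest eigenvector is |00>

print(passes_fact4_with_k0(A_entangled, 2, 2))  # True
print(passes_fact4_with_k0(A_product, 2, 2))    # False
```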

Theorem 5. The operator $A$ is a left or right entanglement witness if and only if (i) there
exists $k \in [0, 1, \ldots, MN - 2]$ such that $\mathrm{span}\{|\lambda_0\rangle, |\lambda_1\rangle, \ldots, |\lambda_k\rangle\}$ contains no separable pure
states and $\lambda_{k+1} > \lambda_k$, or (ii) there exists $l \in [1, 2, \ldots, MN - 1]$ such that
$\mathrm{span}\{|\lambda_l\rangle, |\lambda_{l+1}\rangle, \ldots, |\lambda_{MN-1}\rangle\}$ contains no separable pure states and $\lambda_l > \lambda_{l-1}$.

Theorem 5 immediately gives a method for identifying and constructing entanglement
witnesses.



Definition 4 (Partial Product Basis, Unextendible Product Basis [58]). A partial
product basis of $\mathbb{C}^M \otimes \mathbb{C}^N$ is a set $S$ of mutually orthonormal pure product states spanning a
proper subspace of $\mathbb{C}^M \otimes \mathbb{C}^N$. An unextendible product basis of $\mathbb{C}^M \otimes \mathbb{C}^N$ is a partial product
basis $S$ of $\mathbb{C}^M \otimes \mathbb{C}^N$ whose complementary subspace $(\mathrm{span}\,S)^\perp$ contains no product state.

We can use unextendible product bases to construct ambidextrous witnesses. Suppose $B$ is
an unextendible product basis of $\mathbb{C}^M \otimes \mathbb{C}^N$, and let $B'$ be disjoint from $B$ such that $B \cup B'$
is an orthonormal basis of $\mathbb{C}^M \otimes \mathbb{C}^N$. One possibility is the left witness $A'$ defined as

$$A' = -\sum_{|\lambda\rangle \in B'} |\lambda\rangle\langle\lambda|. \tag{2.7}$$

As well, we could split $B'$ into $B'_L$ and $B'_R$ and define an ambidextrous witness $A''$ as

$$A'' = -\sum_{|\lambda_L\rangle \in B'_L} |\lambda_L\rangle\langle\lambda_L| + \sum_{|\lambda_R\rangle \in B'_R} |\lambda_R\rangle\langle\lambda_R|. \tag{2.8}$$

Another thing to realise is that $\mathrm{span}\,B$ may contain an entangled pure state, which can be
pulled out and put into a $(+1)$-eigenvalue eigenspace of $A'$. Depending on $B$ (and the
dimensions $M$, $N$), there may be several mutually orthogonal pure entangled states in $\mathrm{span}\,B$ whose
span contains no product state; let $B''$ be a set of such pure states. Define the ambidextrous
witness as

$$A''' = -\sum_{|\lambda\rangle \in B'} |\lambda\rangle\langle\lambda| + \sum_{|\lambda\rangle \in B''} |\lambda\rangle\langle\lambda|. \tag{2.9}$$

This suggests the following problem, related to the combinatorial [?] problem of finding
unextendible product bases:

Problem 2. Given $M$ and $N$, find all orthonormal bases $\mathcal{B}$ for $\mathbb{C}^M \otimes \mathbb{C}^N$ such that
• $\mathcal{B}$ is the disjoint union of $A_L$, $B$, $A_R$,

• $\mathrm{span}\,A_L$ and $\mathrm{span}\,A_R$ contain no product state,

• $\mathrm{span}(A_L \cup A_R)$ contains a product state, and

• $\min\{|A_L|, |A_R|\}$ is maximal.

Such bases may give "optimal" ambidextrous witnesses, which detect the largest volume of
entangled states on opposite sides of $\mathcal{S}_{M,N}$.

We will see in Chapter 3 that the functions $a^*(A)$ and $b^*(A)$ are difficult (NP-hard) to
compute. Thus a criticism of constructing witnesses via the spectral decomposition is that
even if you can construct the corresponding observable, you still have to perform a difficult
computation to make it useful. However, most experimental applications of entanglement
witnesses are in very low dimensions, where computing $a^*(A)$ and $b^*(A)$ deterministically is
not a problem - it may even be done analytically, as in the example below.



Example: Noisy Bell states 

A simple illustration of how AEWs may be used involves detecting and distinguishing
noisy Bell states. Define the four Bell states in $\mathbb{C}^2 \otimes \mathbb{C}^2$:

$$|\psi^\pm\rangle := (|00\rangle \pm |11\rangle)/\sqrt{2}$$
$$|\phi^\pm\rangle := (|01\rangle \pm |10\rangle)/\sqrt{2}.$$



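It is straightforward to show that the Bell states are, pairwise, on opposite sides of $\mathcal{S}_{2,2}$.$^3$
Define the operators

$$A_\psi := -|\psi^-\rangle\langle\psi^-| + |\psi^+\rangle\langle\psi^+|,$$
$$A_\phi := -|\phi^-\rangle\langle\phi^-| + |\phi^+\rangle\langle\phi^+|.$$

Both $A_\psi$ and $A_\phi$ are easily seen to be AEWs. It is also straightforward to compute the values

$$a^*(A_\psi) = a^*(A_\phi) = -1/2$$

and

$$b^*(A_\psi) = b^*(A_\phi) = +1/2.$$

3 Suppose a left entanglement witness $W$, with $a^*(W) = 0$, detects $|\psi^+\rangle$ and $|\phi^+\rangle$. Without loss
of generality, $W$ can be written in the Bell basis $\{|\psi^+\rangle, |\phi^+\rangle, \ldots\}$ as

$$W = \begin{pmatrix} -\epsilon_1 & a + bi & \times & \times \\ a - bi & -\epsilon_2 & \times & \times \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{pmatrix} \tag{2.10}$$

for $\epsilon_1$ and $\epsilon_2$ both positive. But the states $|s^\pm\rangle := \frac{1}{\sqrt{2}}(|\psi^+\rangle \pm |\phi^+\rangle)$ are separable. Requiring
$\langle s^+|W|s^+\rangle \ge 0$ gives $2a \ge \epsilon_1 + \epsilon_2$ and requiring $\langle s^-|W|s^-\rangle \ge 0$ gives $2a \le -\epsilon_1 - \epsilon_2$, which,
together, give a contradiction. Similar arguments hold for the other pairs of Bell states.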

Suppose that there is a source that repeatedly emits the same noisy Bell state $\rho$ and that we
want to decide whether $\rho$ is entangled. Define the Pauli operators:

$$\sigma_0 := \tfrac{1}{\sqrt{2}}(|0\rangle\langle 0| + |1\rangle\langle 1|)$$
$$\sigma_1 := \tfrac{1}{\sqrt{2}}(|0\rangle\langle 1| + |1\rangle\langle 0|)$$
$$\sigma_2 := \tfrac{-i}{\sqrt{2}}(|0\rangle\langle 1| - |1\rangle\langle 0|)$$
$$\sigma_3 := \tfrac{1}{\sqrt{2}}(|0\rangle\langle 0| - |1\rangle\langle 1|),$$

where $\{|0\rangle, |1\rangle\}$ is the standard orthonormal basis for $\mathbb{C}^2$. Noting that

$$A_\psi = \sigma_1 \otimes \sigma_1 - \sigma_2 \otimes \sigma_2$$
$$A_\phi = \sigma_1 \otimes \sigma_1 + \sigma_2 \otimes \sigma_2,$$

measuring the expected value of the two observables $\sigma_1 \otimes \sigma_1$ and $\sigma_2 \otimes \sigma_2$ may be sufficient to
decide that $\rho$ is entangled because $\rho \in \mathcal{E}_{2,2}$ if one of the following four inequalities is true:

$$\langle \sigma_1 \otimes \sigma_1 \rangle_\rho \pm \langle \sigma_2 \otimes \sigma_2 \rangle_\rho > 1/2 \tag{2.11}$$
$$\langle \sigma_1 \otimes \sigma_1 \rangle_\rho \pm \langle \sigma_2 \otimes \sigma_2 \rangle_\rho < -1/2.$$

If the noise is known to be of a particular form, then we can also determine which noisy
Bell state was being produced. Let $|B\rangle$ be a Bell state. Suppose $\rho$ is known to be of the
form $p|B\rangle\langle B| + (1 - p)\sigma$ for some $\sigma$ inside both sandwiches $W(A_\psi)$ and $W(A_\phi)$. With $\sigma$ so
defined, one of the four inequalities (2.11) holds only if exactly one of them holds, so that
$|B\rangle$ is determined by which inequality is satisfied. We remark that, if $\sigma$ and $|B\rangle$ are known,
knowledge of the expected value of any single observable $A$ may allow one to compute $p$ and
hence an upper bound on the $l_2$ distance between $\rho$ and the maximally mixed state $I/4$. This
distance may be enough information to conclude that $\rho$ is separable by checking if $\rho$ is inside
the largest separable ball centered at $I/4$.
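The test based on the two observables $\sigma_1 \otimes \sigma_1$ and $\sigma_2 \otimes \sigma_2$ can be sketched numerically. The following assumes NumPy and the labelling of the Bell states under which $A_\psi$ acts on $\mathrm{span}\{|00\rangle, |11\rangle\}$; the function name `violated_inequalities`, the string labels, and the choice of white noise $\sigma = I/4$ are my own illustrative assumptions.

```python
import numpy as np

s = 1 / np.sqrt(2)
sigma1 = s * np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = s * np.array([[0, -1j], [1j, 0]], dtype=complex)

def violated_inequalities(rho):
    """List which of the four witness inequalities are satisfied by rho."""
    e1 = np.trace(np.kron(sigma1, sigma1) @ rho).real
    e2 = np.trace(np.kron(sigma2, sigma2) @ rho).real
    checks = {
        "A_psi > 1/2": e1 - e2 > 0.5,
        "A_psi < -1/2": e1 - e2 < -0.5,
        "A_phi > 1/2": e1 + e2 > 0.5,
        "A_phi < -1/2": e1 + e2 < -0.5,
    }
    return [name for name, holds in checks.items() if holds]

def noisy_bell(p):
    """p |B><B| + (1 - p) I/4 with |B> = (|00> - |11>)/sqrt(2) (white-noise assumption)."""
    bell = np.array([1, 0, 0, -1], dtype=complex) / np.sqrt(2)
    return p * np.outer(bell, bell.conj()) + (1 - p) * np.eye(4) / 4

print(violated_inequalities(noisy_bell(0.8)))  # ['A_psi < -1/2']
print(violated_inequalities(noisy_bell(0.4)))  # []
```

An empty list is inconclusive: the state may still be entangled, since these witnesses are one-sided tests.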

2.3 Convex body problems 

I end this chapter with a brief review of some basic problems for a convex subset $K$ of
$\mathbb{R}^n$ and their meaning in terms of the separability problem when $K = \mathcal{S}_{M,N}$. In a later chapter,
the relationship among these problems will be exploited to solve the quantum separability
problem.
We have already noted that $\mathcal{S}_{M,N}$ may be viewed as a subset of $\mathbb{R}^{M^2N^2-1}$. Let us be
more precise. Let $\mathcal{B} = \{X_i : i = 0, 1, \ldots, M^2N^2 - 1\}$ be an orthonormal, Hermitian basis
for $\mathbb{H}_{M,N}$, where $X_0 = I/\sqrt{MN}$. For concreteness, we can assume that the elements of $\mathcal{B}$ are
tensor-products of the (suitably normalised) canonical generators of SU($M$) and SU($N$), given
e.g. in [?]. Note $\mathrm{tr}(X_i) = 0$ for all $i > 0$. Define $v : \mathbb{H}_{M,N} \rightarrow \mathbb{R}^{M^2N^2-1}$ as

$$v(A) := \begin{pmatrix} \mathrm{tr}(X_1 A) \\ \mathrm{tr}(X_2 A) \\ \vdots \\ \mathrm{tr}(X_{M^2N^2-1} A) \end{pmatrix}. \tag{2.12}$$

Via the mapping $v$, the set of separable states $\mathcal{S}_{M,N}$ can be viewed as a full-dimensional convex
subset of $\mathbb{R}^{M^2N^2-1}$,

$$\{v(\sigma) \in \mathbb{R}^{M^2N^2-1} : \sigma \in \mathcal{S}_{M,N}\}, \tag{2.13}$$

which properly contains the origin $v(I_{M,N}) = 0 \in \mathbb{R}^{M^2N^2-1}$ (recall that there is a ball of
separable states of nonzero radius centred at the maximally mixed state $I_{M,N}$). For traceless
$A_1, A_2 \in \mathbb{H}_{M,N}$, we clearly have $\mathrm{tr}(A_1 A_2) = v(A_1)^T v(A_2)$. For $A \in \mathbb{H}_{M,N}$ and $\rho \in \mathcal{D}_{M,N}$,
where $A := \sum_{i=0}^{M^2N^2-1} a_i X_i$ and $\rho := \sum_{i=0}^{M^2N^2-1} \rho_i X_i$, we have $\mathrm{tr}(A\rho) = a_0\rho_0 + v(A)^T v(\rho)$. But
$\rho_0$ is fixed at $1/\sqrt{MN}$ for all $\rho \in \mathcal{D}_{M,N}$. Thus, in terms of entanglement witnesses $A$, we might
as well restrict to those $A$ that have $a_0 = 0$; that is, we may restrict to traceless entanglement
witnesses without loss of generality. In the definitions below, the vector $c$ corresponds to a
traceless right entanglement witness when $K = \mathcal{S}_{M,N}$.
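As a hedged numerical check of the mapping $v$, the sketch below builds an orthonormal Hermitian basis for two qubits from normalised Pauli operators (one concrete choice for $M = N = 2$, assumed here for illustration) and verifies the identity $\mathrm{tr}(A_1 A_2) = v(A_1)^T v(A_2)$ on random traceless Hermitian matrices. The helper names are hypothetical.

```python
import numpy as np
from itertools import product

s = 1 / np.sqrt(2)
paulis = [
    s * np.eye(2, dtype=complex),
    s * np.array([[0, 1], [1, 0]], dtype=complex),
    s * np.array([[0, -1j], [1j, 0]], dtype=complex),
    s * np.array([[1, 0], [0, -1]], dtype=complex),
]
# basis[0] = I/2 = I/sqrt(MN); the remaining 15 elements are traceless.
basis = [np.kron(a, b) for a, b in product(paulis, repeat=2)]

def v(A):
    """Coordinates of A along the traceless basis elements X_1, ..., X_15."""
    return np.array([np.trace(X @ A).real for X in basis[1:]])

def random_traceless_hermitian(rng):
    G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    H = (G + G.conj().T) / 2
    return H - (np.trace(H).real / 4) * np.eye(4)

rng = np.random.default_rng(1)
A1 = random_traceless_hermitian(rng)
A2 = random_traceless_hermitian(rng)

print(np.isclose(np.trace(A1 @ A2).real, v(A1) @ v(A2)))  # True
print(np.allclose(v(np.eye(4) / 4), 0.0))                 # True: v maps I/4 to the origin
```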

The following definitions can be found in [1||. 

Definition 5 (Strong Membership Problem (SMEM)). Given a point $p \in \mathbb{R}^n$, decide
whether $p \in K$.

Definition 6 (Strong Separation Problem (SSEP)). Given a point $p \in \mathbb{R}^n$, either assert
that $p \in K$, or find a vector $c \in \mathbb{R}^n$ such that $c^T p > \max\{c^T x \mid x \in K\}$.

For $K = \mathcal{S}_{M,N}$, SMEM corresponds exactly to the formal quantum separability problem as
defined earlier. SSEP also solves SMEM, but, in the case where $p$ represents an entangled
state, also provides a right entanglement witness (note how the unconventional definition of
"entanglement witness" fits nicely here).

Definition 7 (Strong Optimisation Problem (SOPT)). Given a vector $c \in \mathbb{R}^n$, either
find a point $k \in K$ that maximises $c^T x$ on $K$, or assert that $K$ is empty.



SOPT corresponds to the problem of calculating $b^*(A)$ for a potential right entanglement
witness $A$. The optimisation problem over $\mathcal{S}_{M,N}$ will continue to play a major role throughout
this thesis.

Definition 8 (Strong Validity Problem (SVAL)). Given a vector $c \in \mathbb{R}^n$ and a number
$\gamma \in \mathbb{R}$, decide whether $c^T x \le \gamma$ holds for all $x \in K$.

For $K = \mathcal{S}_{M,N}$, SVAL asks, "Given a potential right entanglement witness $A$ and a number
$b$, is $b^*(A) \le b$?"

Let $K'$ be a convex subset of $\mathbb{R}^n$.

Definition 9 (Strong Violation Problem (SVIOL)). Given a vector $d \in \mathbb{R}^n$ and a
number $\gamma \in \mathbb{R}$, decide whether $d^T x \le \gamma$ holds for all $x \in K'$, and, if not, find a vector $y \in K'$
with $d^T y > \gamma$.

Note that taking $d = 0$ and $\gamma = -1$, the strong violation problem reduces to the problem
of checking whether $K'$ is empty, and if not, finding a point in $K'$. This problem is called
the Feasibility Problem and will arise in later chapters (but not for $K'$ equal to $\mathcal{S}_{M,N}$,
which is why I switched notation from "$K$" to "$K'$" to define this problem).
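To fix ideas, the strong problems can be solved in closed form for a toy convex body. The sketch below takes $K$ to be a Euclidean ball centred at the origin, which is my own illustrative choice and is far easier than $K = \mathcal{S}_{M,N}$; the class `Ball` and its method names are hypothetical.

```python
import math

def norm(x):
    return math.sqrt(sum(t * t for t in x))

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

class Ball:
    """Toy convex body K = {x : ||x|| <= r}, used only for illustration."""

    def __init__(self, radius):
        self.r = radius

    def smem(self, p):
        """SMEM: decide whether p is in K."""
        return norm(p) <= self.r

    def ssep(self, p):
        """SSEP: return None if p is in K, else c with c.p > max over K of c.x."""
        return None if self.smem(p) else list(p)

    def sopt(self, c):
        """SOPT: a point of K maximising c.x."""
        n = norm(c)
        return [self.r * t / n for t in c] if n > 0 else [0.0] * len(c)

    def sval(self, c, gamma):
        """SVAL: decide whether c.x <= gamma holds for all x in K."""
        return dot(c, self.sopt(c)) <= gamma

K = Ball(1.0)
p = [0.6, 0.9]
c = K.ssep(p)
print(K.smem(p))                         # False
print(dot(c, p) > dot(c, K.sopt(c)))     # True: c separates p from K
print(K.sval([1.0, 0.0], 1.0))           # True
```

For the ball, the separating vector can be taken to be $p$ itself, and SVAL reduces to one call to SOPT; the latter reduction works for any $K$ for which SOPT can be solved.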



Chapter 3 

Separability as a Computable Decision 
Problem 



The earlier formal definition gave us a concrete statement of the quantum separability problem that we
could use to explore some important results. Now we step back from that definition and con- 
sider more carefully how we might define the quantum separability problem for the purposes 
of computing it. 

For a number of reasons, we settle on approximate formulations of the problem and give 
a few examples that are, in a sense, equivalent. I then formulate the quantum separability 
problem as an NP-hard problem in NP. I end the chapter with a survey of algorithms for 
the approximate quantum separability problem; one of the algorithms comes directly from 
a second NP-formulation and can be considered as the weakening of a recent algorithm by
Hulpke and Bruß [?].



3.1 Formulating the quantum separability problem 

The nature of the quantum separability problem and the possibility for quantum computers
allows a number of approaches, depending on whether the input to the problem is classical
(a matrix representing $\rho$) or quantum ($T$ copies of a physical system prepared in state $\rho$) and
whether the processing of the input will be done on a classical computer or on a quantum
computer. In Chapter 2 we dealt with the case of a quantum input and very limited quantum
processing in the form of measurement of each copy of $\rho$; we will deal with this case in more
detail in a later chapter. The case of more-sophisticated quantum processing on either a quantum
or classical input is not well studied (see [?] for an instance of more-sophisticated quantum
processing on a quantum input). For the remainder of this chapter, I focus on the case where
input and processing are classical.
3.1.1 Exact formulations

Let us examine the formal definition of the quantum separability problem (in either of its
equivalent forms) from a computational viewpoint. The matrix $[\rho]$ is allowed to have real entries.
Certainly there are real numbers that are uncomputable (e.g. a number whose $n$th binary digit is 1
if and only if the $n$th Turing machine halts on input $n$); we disallow such inputs. However, the
real numbers $e$, $\pi$, and $\sqrt{2}$ are computable to any degree of approximation, so in principle they
should be allowed to appear in $[\rho]$. In general, we should allow any real number that can be
approximated arbitrarily well by a computer subroutine. If $[\rho]$ consists of such real numbers
(subroutines), say that "$\rho$ is given as an approximation algorithm for $[\rho]$." In this case, we have a
procedure to which we can give an accuracy parameter $\delta > 0$ and out of which will be returned a
matrix $[\rho]_\delta$ that is (in some norm) at most $\delta$ away from $[\rho]$. Because $\mathcal{S}_{M,N}$ is closed, the
sequence $([\rho]_{1/n})_{n=1,2,\ldots}$ may converge to a point on the boundary of $\mathcal{S}_{M,N}$ (when $\rho$ is on the
boundary of $\mathcal{S}_{M,N}$). For such $\rho$, the formal quantum separability problem may be "undecidable"
because the $\delta$-radius ball centred at $[\rho]_\delta$ may contain both separable and entangled states for
all $\delta > 0$ [?] (more generally, see "Type II computability" in [?]).
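The notion of "$\rho$ given as an approximation algorithm for $[\rho]$" can be illustrated with exact rational arithmetic. The sketch below is a hypothetical example of my own: the target matrix is $\mathrm{diag}(1/\sqrt{2}, 0, 0, 1 - 1/\sqrt{2})$, the function name `rho_delta` is illustrative, and the distance is measured in the Frobenius norm.

```python
from fractions import Fraction
from math import isqrt

def rho_delta(delta):
    """Rational 4x4 matrix within Frobenius distance delta of
    diag(1/sqrt(2), 0, 0, 1 - 1/sqrt(2)), with trace exactly 1."""
    k = 0
    while Fraction(1, 10**k) > delta:
        k += 1
    # isqrt gives floor(sqrt(2) * 10^k), so 0 <= 1/sqrt(2) - q <= 10^(-k) / 2.
    q = Fraction(isqrt(2 * 10**(2 * k)), 2 * 10**k)
    diag = [q, Fraction(0), Fraction(0), 1 - q]
    return [[diag[i] if i == j else Fraction(0) for j in range(4)] for i in range(4)]

approx = rho_delta(Fraction(1, 1000))
print(approx[0][0])                           # 707/1000
print(sum(approx[i][i] for i in range(4)))    # 1
```

Both nonzero entries carry the same error $e \le 10^{-k}/2$, so the Frobenius distance is $\sqrt{2}\,e < 10^{-k} \le \delta$.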

If we really want to determine the complexity of deciding membership in Sm,n, it makes 
sense not to confuse this with the complexity of specifying the input. To give the computer a 
fighting chance, it makes more sense to restrict to inputs that have finite exact representations 
that can be readily subjected to elementary arithmetic operations begetting exact answers. 
For this reason, we might restrict the formal quantum separability problem to instances where 
[p] consists of rational entries: 

Definition 10 (Rational quantum separability problem (EXACT QSEP)). Let $\rho \in \mathcal{D}_{M,N}$
be a mixed state such that the matrix $[\rho]$ (with respect to the standard basis of $\mathbb{C}^M \otimes \mathbb{C}^N$)
representing $\rho$ consists of rational entries. Given $[\rho]$, is $\rho$ separable?

As pointed out in [?], Tarski's algorithm$^1$ can be used to solve EXACT QSEP exactly.
The Tarski-approach is as follows. Note that the following first-order logical formula$^2$ is true
if and only if $\rho$ is separable:

$$\forall A\,[(\forall \Psi\,(\mathrm{tr}(A\Psi) \ge 0)) \rightarrow (\mathrm{tr}(A\rho) \ge 0)], \tag{3.1}$$

where $A \in \mathbb{H}_{M,N}$ and $\Psi$ is a pure product state.

1 Tarski's result is often called the "Tarski-Seidenberg" theorem, after Seidenberg, who found a
slightly better algorithm (and elaborated on its generality) in 1954, shortly after Tarski managed
to publish his; but Tarski discovered his own result in 1930 (the war prevented him from publishing
before 1948).

2 Recall the logical connectives: $\vee$ ("OR"), $\wedge$ ("AND"), $\neg$ ("NOT"); the symbol $\rightarrow$ ("IMPLIES"),
in "$x \rightarrow y$", is a shorthand, as "$x \rightarrow y$" is equivalent to "$(\neg x) \vee y$"; as well, we can consider "$x \vee y$"
shorthand for "$\neg((\neg x) \wedge (\neg y))$". Also recall the existential and universal quantifiers $\exists$ ("THERE
EXISTS") and $\forall$ ("FOR ALL"); note that the universal quantifier $\forall$ is redundant as "$\forall x\,\phi(x)$" is
equivalent to "$\neg\exists x\,\neg\phi(x)$".

To see that (3.1) characterises separability, note that the subformula
enclosed in square brackets means "$A$ is not a (left) entanglement witness for $\rho$", so that if
this statement is true for all $A$ then there exists no entanglement witness detecting $\rho$. When
$[\rho]$ is rational, our experience in Section 1.3.2 with polynomial constraints tells us that the
formula in (3.1) can be written in terms of "quantified polynomial inequalities" with rational
coefficients:

$$\forall X\,\{(\forall Y\,[Q(Y) \rightarrow (r(X, Y) \ge 0)]) \rightarrow (s(X) \ge 0)\}, \tag{3.2}$$

where

• $X$ is a block of real variables parametrising the matrix $A \in \mathbb{H}_{M,N}$ (with respect to an
orthogonal rational Hermitian basis of $\mathbb{H}_{M,N}$); the "Hermiticity" of $X$ is hard-wired by
the parametrisation;

• $Y$ is a block of real variables parametrising the matrix $\Psi$;

• $Q(Y)$ is a conjunction of four polynomial equations that are equivalent to the four
constraints $\mathrm{tr}((\mathrm{tr}_j(\Psi))^2) = 1$ and $\mathrm{tr}((\mathrm{tr}_j(\Psi))^3) = 1$ for $j \in \{A, B\}$;

• $r(X, Y)$ is a polynomial representing the expression $\mathrm{tr}(A\Psi)$;$^3$

• $s(X)$ is a polynomial representing the expression $\mathrm{tr}(A[\rho])$.

The main point of Tarski's result is that the quantifiers (and variables) in the above sentence
can be eliminated so that what is left is just a formula of elementary algebra involving Boolean
connections of atomic formulae of the form $(\alpha \circ 0)$ involving terms $\alpha$ consisting of rational
numbers, where $\circ$ stands for any of $<, >, =, \neq$; the truth of the remaining (very long) formula
can be computed in a straightforward manner. The best algorithms for deciding (3.2) require
a number of arithmetic operations roughly equal to $(PD)^{O(|X|) \times O(|Y|)}$, where $P$ is the number
of polynomials in the input, $D$ is the maximum degree of the polynomials, and $|X|$ ($|Y|$)
denotes the number of variables in block $X$ ($Y$) [?].$^4$ Since $P = 6$ and $D = 3$, the running
time is roughly $18^{O(M^2N^2) \times O(M^2N^2)}$ (times the length of the encoding of the rational inputs).

3 To ensure the Hermitian basis is rational, we do not insist that each of its elements has unit
Euclidean norm. If the basis is $\{X_i\}_{i=0,1,\ldots,M^2N^2-1}$, where $X_0$ is proportional to the identity
operator, then we can ignore the $X_0$ components and write $A = \sum_{i=1}^{M^2N^2-1} A_i X_i$ and
$\Psi = \sum_{i=1}^{M^2N^2-1} \Psi_i X_i$. An expression for $\mathrm{tr}(A\Psi)$ in terms of the real variables $A_i$ and $\Psi_i$ may
then look like $\sum_i A_i \Psi_i\, \mathrm{tr}(X_i^2)$.

4 Ironically, due to some computer font incompatibility, my copy of this paper, entitled "On
the computational and algebraic complexity of quantifier elimination," did not display any of the
quantifiers.
3.1.2 Approximate formulations

The benefit of EXACT QSEP is that, compared to the earlier formal definition, it eliminated any
uncertainty in the input by disallowing irrational matrix entries. Consider the following motivation
for an alternative to EXACT QSEP, where, roughly, we only ask whether the input $[\rho]$
corresponds to something close to separable:

• Suppose we really want to determine the separability of a density operator $\rho$ such that
$[\rho]$ has irrational entries. If we use the EXACT QSEP formulation (so far, we have no
decidable alternative), we must first find a rational approximation to $[\rho]$. Suppose the
(Euclidean) distance from $[\rho]$ to the approximation is $\delta$. The answer that the Tarski-style
algorithm gives us might be wrong, if $\rho$ is not more than $\delta$ away from the boundary
of $\mathcal{S}_{M,N}$.

• Suppose the input matrix came from measurements of many copies of a physical state
$\rho$. Then we only know $[\rho]$ to some degree of approximation.

• The best known Tarski-style algorithms for EXACT QSEP have gigantic running times.
Surely, we can achieve better asymptotic running times if we use an approximate
formulation.

Thus, in many cases of interest, insisting that an algorithm says exactly whether the input 
matrix corresponds to a separable state is a waste of time. In Section 3.2.2, we will see that 
there is another reason to use an approximate formulation, if we would like the problem to 
fit nicely in the theory of NP-completeness. 

Gurvits was the first to use the weak membership formulation of the quantum separability 
problem [?]. For $x \in \mathbb{R}^n$ and $\delta > 0$, let $B(x,\delta) := \{y \in \mathbb{R}^n : \|x - y\| \le \delta\}$. For a convex 
subset $K \subset \mathbb{R}^n$, let $S(K,\delta) := \bigcup_{x \in K} B(x,\delta)$ and $S(K,-\delta) := \{x : B(x,\delta) \subseteq K\}$. 

Definition 11 (Weak membership problem (WMEM)). Given a rational vector $p \in \mathbb{R}^n$ 
and rational $\delta > 0$, assert either that 

$p \in S(K,\delta)$, or (3.3) 
$p \notin S(K,-\delta)$. (3.4) 
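
As a small illustration of Definition 11 (not part of the original text), the sketch below lists which of the two assertions are valid for a toy convex set: the closed Euclidean ball of radius `radius` centred at `centre`, for which $S(K,\delta)$ and $S(K,-\delta)$ are again balls of radius `radius + delta` and `radius - delta`. The function name and its parameters are hypothetical, and no such closed-form test is available for the set of separable states.

```python
import math

def valid_wmem_answers(p, delta, centre, radius):
    """Return the set of assertions of Definition 11 that are valid for p,
    for the toy set K = closed ball B(centre, radius) in R^n."""
    dist = math.dist(p, centre)
    answers = set()
    if dist <= radius + delta:   # p lies in S(K, delta)
        answers.add("3.3")
    if dist > radius - delta:    # p lies outside S(K, -delta)
        answers.add("3.4")
    return answers
```

Points within distance `delta` of the boundary of the ball admit both assertions, which is the two-sided "error" discussed in the text; a weak membership algorithm may return either one there.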

Denote by WMEM($\mathcal{S}_{M,N}$) the quantum separability problem formulated as the weak 
membership problem. An algorithm solving WMEM($\mathcal{S}_{M,N}$) is a separability test with two-sided 
"error" 5 in the sense that it may assert (3.3) when $p$ represents an entangled state and may 
assert (3.4) when $p$ represents a separable state. Any formulation of the quantum separability 
problem will have (at least) two possible answers - one corresponding to "$p$ approximately 
represents a separable state" and the other corresponding to "$p$ approximately represents an 
entangled state". As in WMEM($\mathcal{S}_{M,N}$), there may be a region of $p$ where both answers are 
valid. We can use a different formulation where this region is shifted to be either completely 
outside $\mathcal{S}_{M,N}$ or completely inside $\mathcal{S}_{M,N}$. 

5 Of course, relative to the problem definition, there is no error. 

Definition 12 (In-biased weak membership problem (WMEM$_{\mathrm{in}}$)). Given a rational 
vector $p \in \mathbb{R}^n$ and rational $\delta > 0$, assert either that 

$p \in S(K,\delta)$, or (3.5) 
$p \notin K$. (3.6) 

Definition 13 (Out-biased weak membership problem (WMEM$_{\mathrm{out}}$)). Given a rational 
vector $p \in \mathbb{R}^n$ and rational $\delta > 0$, assert either that 

$p \in K$, or (3.7) 
$p \notin S(K,-\delta)$. (3.8) 

We can also formulate a "zero-error" version such that when $p$ is in such a region, then any 
algorithm for the problem has the option of saying so, but otherwise must answer exactly: 

Definition 14 (Zero-error weak membership problem (WMEM$^0$)). Given a rational 
vector $p \in \mathbb{R}^n$ and rational $\delta > 0$, assert either that 

$p \in K$, or (3.9) 
$p \notin K$, or (3.10) 
$p \in S(K,\delta) \setminus S(K,-\delta)$. (3.11) 
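
To see why the third answer in Definition 14 is useful, consider again a toy set $K$ that is a closed ball of radius `radius`, and suppose (as an assumption made only for this sketch) that the distance from $p$ to the centre is known only through an estimate that is off by at most `delta / 2`. The hypothetical routine below then never gives a wrong exact answer, and falls back to (3.11) only when $p$ is within `delta` of the boundary.

```python
def zero_error_answer(dist_estimate, delta, radius):
    """Toy zero-error weak membership for K = ball of the given radius.

    Assumes |dist_estimate - true distance to the centre| <= delta / 2.
    Returns the equation number of the assertion that is made.
    """
    if dist_estimate <= radius - delta / 2:
        return "3.9"    # true distance <= radius, so p is in K
    if dist_estimate > radius + delta / 2:
        return "3.10"   # true distance > radius, so p is not in K
    # Here radius - delta < true distance <= radius + delta.
    return "3.11"       # p is in S(K, delta) but not in S(K, -delta)
```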

All the above formulations of the quantum separability problem are based on the Euclidean 
norm and use the isomorphism between $\mathbb{H}_{M,N}$ and $\mathbb{R}^{M^2N^2}$. We could also make similar 
formulations based on other operator norms in $\mathbb{H}_{M,N}$. In the next section, we will see yet another 
formulation of an entirely different flavour. While each formulation is slightly different, they 
all have the property that in the limit as the error parameter approaches 0, the problem 
coincides with EXACT QSEP. Thus, despite the apparent inequivalence of these formulations, we 
recognise that they all basically do the same job. In fact, WMEM($\mathcal{S}_{M,N}$), WMEM$_{\mathrm{in}}$($\mathcal{S}_{M,N}$), 
WMEM$_{\mathrm{out}}$($\mathcal{S}_{M,N}$), and WMEM$^0$($\mathcal{S}_{M,N}$) are equivalent: given an algorithm for one of the 
problems, one can solve an instance $(p, \delta)$ of any of the other three problems by just calling 
the given algorithm at most twice (with various parameters). 6 

6 To show this equivalence, it suffices to show that given an algorithm for WMEM($\mathcal{S}_{M,N}$), 
one can solve WMEM$_{\mathrm{out}}$($\mathcal{S}_{M,N}$) with one call to the given algorithm (the converse is trivial); 

3.2 Computational complexity 



This section addresses how the quantum separability problem fits into the framework of 
complexity theory. I assume the reader is familiar with concepts such as problem, instance 
(of a problem), (reasonable, binary) encodings, polynomial relatedness, size (of an instance), 
(deterministic and nondeterministic) Turing machine, and polynomial-time algorithm; all of 
which can be found in any of [?]. 

Generally, the weak membership problem is defined for a class $\mathcal{K}$ of convex sets. For 
example, in the case of WMEM($\mathcal{S}_{M,N}$), this class is $\{\mathcal{S}_{M,N}\}_{M,N}$ for all integers $M$ and $N$ such 
that $2 \le M \le N$. An instance of WMEM thus includes the specification of a member $K$ of 
$\mathcal{K}$. The size of an instance must take into account the size $\langle K \rangle$ of the encoding of $K$. It is 
reasonable that $\langle K \rangle \ge n$ when $K \subset \mathbb{R}^n$, because an algorithm for the problem should be able 
to work efficiently 7 with points in $\mathbb{R}^n$. But the complexity of $K$ matters, too. For example, if 
$K$ extends (doubly-exponentially) far from the origin (but contains the origin) then $K$ may 
contain points that require large amounts of precision to represent; again, an algorithm for 
the problem should be able to work with such points efficiently (for example, it should be 
able to add such a point and a point close to the origin, and store the result efficiently). In 
the case of WMEM($\mathcal{S}_{M,N}$), the size of the encoding of $\mathcal{S}_{M,N}$ may be taken as $N$ (assuming 
$M \le N$), as $\mathcal{S}_{M,N}$ is not unreasonably long or unreasonably thin: it is contained in the unit 
sphere in $\mathbb{R}^{M^2N^2-1}$ and contains a ball of separable states of radius $\Omega(1/\mathrm{poly}(N))$ (see Section 
1.2). Thus, the total size of an instance of WMEM($\mathcal{S}_{M,N}$), or any formulation of the quantum 
separability problem, may also be taken to be $N$ plus the size of the encoding of $(p, \delta)$. 



3.2.1 Review of NP-completeness 

Complexity theory, and, particularly, the theory of NP-completeness, pertains to decision 
problems - problems that pose a yes/no question. Let $\Pi$ be a decision problem. Denote 
by $D_\Pi$ the set of instances of $\Pi$, and denote the yes-instances of $\Pi$ by $Y_\Pi$. Recall that the 
complexity class P (respectively, NP) is the set of all problems that can be decided by a 
deterministic Turing machine (respectively, nondeterministic Turing machine) in polynomial 
time. 

a similar proof shows that one can solve WMEM$_{\mathrm{in}}$($\mathcal{S}_{M,N}$) with one call to the algorithm for 
WMEM($\mathcal{S}_{M,N}$). The other relationships follow immediately. Let $(p,\delta)$ be the given instance of 
WMEM$_{\mathrm{out}}$($\mathcal{S}_{M,N}$). Define $p_0 := p + \delta(p - I_{M,N})/2$ and $\delta_0 := \delta/(2\sqrt{MN(MN-1)})$. Call the 
algorithm for WMEM($\mathcal{S}_{M,N}$) with input $(p_0,\delta_0)$. Suppose the algorithm asserts $p_0 \notin S(\mathcal{S}_{M,N}, -\delta_0)$. 
Then, because $\|p - p_0\| = \frac{\delta}{2}\|p - I_{M,N}\|$ and $\|p - I_{M,N}\| \le 1$, we have $p \notin S(\mathcal{S}_{M,N}, -(\delta_0 + \delta/2))$, 
hence $p \notin S(\mathcal{S}_{M,N}, -\delta)$. Otherwise, suppose the algorithm asserts $p_0 \in S(\mathcal{S}_{M,N},\delta_0)$. By way of 
contradiction, assume that $p$ is entangled. But then, by convexity of $\mathcal{S}_{M,N}$ and the fact that $\mathcal{S}_{M,N}$ 
contains the ball $B(I_{M,N}, 1/\sqrt{MN(MN-1)})$, we can derive that the ball $B(p_0,\delta_0)$ does not 
intersect $\mathcal{S}_{M,N}$. But this implies $p_0 \notin S(\mathcal{S}_{M,N}, \delta_0)$ - a contradiction. Thus, $p \in \mathcal{S}_{M,N}$. This proof is a 
slight modification of the argument given in [68]. 

7 Recall that "efficiently" means "in time that is upper-bounded by a polynomial in the size of an 
instance" (the same polynomial for all instances). 

The following equivalent definition of NP is perhaps more intuitive: 

Definition 15 (NP). A decision problem $\Pi$ is in NP if there exists a deterministic Turing 
machine $T_\Pi$ such that for every instance $I \in Y_\Pi$ there exists a string $C_I$ of length $|C_I| \in 
O(\mathrm{poly}(|I|))$ such that $T_\Pi$, with input $C_I$, can check that $I$ is in $Y_\Pi$ in time $O(\mathrm{poly}(|I|))$. 

The string $C_I$ is called a (succinct) certificate. Let $\Pi^c$ be the complementary problem of $\Pi$, 
i.e. $D_{\Pi^c} = D_\Pi$ and $Y_{\Pi^c} := D_\Pi \setminus Y_\Pi$. The class co-NP is thus defined as $\{\Pi^c : \Pi \in \mathrm{NP}\}$. 
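
To make the notion of a succinct certificate concrete, here is a minimal sketch for PARTITION (the problem that appears later in this chapter in connection with Gurvits' result): an instance is a list of positive integers, and a certificate is a set of indices whose weights sum to exactly half the total. The function name is hypothetical; the point is only that checking takes time polynomial in the instance size, even though finding the certificate may not.

```python
def verify_partition_certificate(weights, subset_indices):
    """Check a claimed certificate for a PARTITION instance.

    weights: list of positive integers (the instance).
    subset_indices: iterable of indices into weights (the certificate).
    """
    n = len(weights)
    indices = set(subset_indices)
    # Reject certificates that refer to non-existent items.
    if any(not (0 <= i < n) for i in indices):
        return False
    chosen = sum(weights[i] for i in indices)
    # The chosen items must carry exactly half of the total weight.
    return 2 * chosen == sum(weights)
```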

Let us briefly review the different notions of "polynomial-time reduction" from one problem 
$\Pi'$ to another $\Pi$. Let $O_\Pi$ be an oracle, or black-boxed subroutine, for solving $\Pi$, to which 
we assign unit complexity cost. A (polynomial-time) Turing reduction from $\Pi'$ to $\Pi$ is any 
polynomial-time algorithm for $\Pi'$ that makes calls to $O_\Pi$. Write $\Pi' \le_T \Pi$ if $\Pi'$ is 
Turing-reducible to $\Pi$. A polynomial-time transformation, or Karp reduction, from $\Pi'$ to $\Pi$ is a Turing 
reduction from $\Pi'$ to $\Pi$ in which $O_\Pi$ is called at most once and at the end of the reduction 
algorithm, so that the answer given by $O_\Pi$ is the answer to the given instance of $\Pi'$. 8 Write 
$\Pi' \le_K \Pi$ if $\Pi'$ is Karp-reducible to $\Pi$. Karp and Turing reductions are on the extreme ends 
of a spectrum of polynomial-time reductions; see [?] for a comparison of several of them. 
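
The shape of a Karp reduction can be sketched with a textbook example that is not taken from the original text: PARTITION transforms to SUBSET-SUM by asking for a subset that sums to half the total. In the sketch below the brute-force `subset_sum_oracle` is only a stand-in for the oracle $O_\Pi$ (it is exponential-time), and all function names are hypothetical. The reduction itself makes a single oracle call at the very end and passes the answer through unchanged.

```python
from itertools import combinations

def subset_sum_oracle(weights, target):
    """Brute-force stand-in for an oracle that decides SUBSET-SUM."""
    return any(sum(c) == target
               for r in range(len(weights) + 1)
               for c in combinations(weights, r))

def karp_partition_to_subset_sum(weights):
    """Map a PARTITION instance (positive integers) to a SUBSET-SUM instance.

    No oracle call is made here; this is the polynomial-time transformation.
    """
    total = sum(weights)
    if total % 2 == 1:
        # An odd total cannot be split evenly: emit a fixed "no"-instance.
        return [1], 2
    return list(weights), total // 2

def decide_partition(weights):
    """Karp-style use of the oracle: one call, answer returned as is."""
    instance = karp_partition_to_subset_sum(weights)
    return subset_sum_oracle(*instance)
```

A general Turing reduction would be free to call the oracle many times and to post-process the answers, which is exactly the extra freedom discussed above.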

Reductions between problems are a way of determining how hard one problem is relative 
to another. The notion of NP-completeness is meant to define the hardest problems in NP. 
We can define NP-completeness with respect to any polynomial-time reduction; we define 
Karp-NP-completeness and Turing-NP-completeness: 

NPC$_K$ $:= \{\Pi \in \mathrm{NP} : \Pi' \le_K \Pi \text{ for all } \Pi' \in \mathrm{NP}\}$ (3.12) 
NPC$_T$ $:= \{\Pi \in \mathrm{NP} : \Pi' \le_T \Pi \text{ for all } \Pi' \in \mathrm{NP}\}$. (3.13) 

We have NPC$_K$ $\subseteq$ NPC$_T$. Let $\Pi$, $\Pi'$, and $\Pi''$ be problems in NP, and, furthermore, suppose 
$\Pi'$ is in NPC$_K$. If $\Pi' \le_T \Pi$, then, in a sense, $\Pi$ is at least as hard as $\Pi'$ (which gives an 
interpretation of the symbol "$\le_T$"). Suppose $\Pi' \le_T \Pi$ but suppose also that $\Pi' \not\le_K \Pi$. If 
$\Pi' \le_K \Pi''$, then we can say that "$\Pi''$ is at least as hard as $\Pi$", because, to solve $\Pi'$ (and 
thus any other problem in NP), $O_\Pi$ has to be used at least as many times as $O_{\Pi''}$; if any 
Turing reduction proving $\Pi' \le_T \Pi$ requires more than one call to $O_\Pi$, then we can say "$\Pi''$ is 
harder than $\Pi$". Therefore, if NPC$_K$ $\neq$ NPC$_T$, then the problems in NPC$_K$ are harder than 
the problems in NPC$_T$ $\setminus$ NPC$_K$; thus NPC$_K$ are the hardest problems in NP (with respect to 
polynomial-time reductions). 

A problem $\Pi$ is NP-hard when $\Pi' \le_T \Pi$ for some Karp-NP-complete problem $\Pi' \in$ NPC$_K$. 
The term "NP-hard" is also used for problems other than decision problems. For example, 
let $\Pi' \in$ NPC$_K$; then WMEM($\mathcal{S}_{M,N}$) is NP-hard if there exists a polynomial-time algorithm 
for $\Pi'$ that calls $O_{\mathrm{WMEM}(\mathcal{S}_{M,N})}$. 

8 In other words, a Karp reduction from $\Pi'$ to $\Pi$ is a polynomial-time algorithm that (under a 
reasonable encoding) takes as input an (encoding of an) instance $I'$ of $\Pi'$ and outputs an (encoding 
of an) instance $I$ of $\Pi$ such that $I' \in Y_{\Pi'} \Leftrightarrow I \in Y_\Pi$. 

3.2.2 Quantum separability problem in NP 

Fact [U suggests that the quantum separability problem is ostensibly in NP: a nondeter- 
ministic Turing machine guesses 

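$\{(p_i, [|\psi_i^A\rangle], [|\psi_i^B\rangle])\}_{i=1}^{M^2N^2}$, 9 and then easily checks that 

$[\rho] = \sum_{i=1}^{M^2N^2} p_i [|\psi_i^A\rangle][\langle\psi_i^A|] \otimes [|\psi_i^B\rangle][\langle\psi_i^B|]$. (3.14) 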

Hulpke and Bruß [?] have demonstrated another hypothetical guess-and-check procedure that 
does not involve the numbers $p_i$. They noticed that, given the vectors $\{[|\psi_i^A\rangle], [|\psi_i^B\rangle]\}_{i=1}^{M^2N^2}$, 
one can check that 

$\{[|\psi_i^A\rangle][\langle\psi_i^A|] \otimes [|\psi_i^B\rangle][\langle\psi_i^B|]\}_{i=1}^{M^2N^2}$ is affinely independent; and (3.15) 

$[\rho] \in \mathrm{conv}\{[|\psi_i^A\rangle][\langle\psi_i^A|] \otimes [|\psi_i^B\rangle][\langle\psi_i^B|]\}_{i=1}^{M^2N^2}$ (3.16) 

in polynomially many arithmetic operations. 

Membership in NP is only defined for decision problems. Since none of the weak membership 
formulations of the quantum separability problem can be rephrased as decision problems 
(because problem instances corresponding to states near the boundary of $\mathcal{S}_{M,N}$ can satisfy 
both possible answers), we cannot consider their membership in NP. However, EXACT QSEP 
is a decision problem. 

Problem 3. Is EXACT QSEP in NP? 

Hulpke and Bruß have formalised some important notions related to this problem. They show 
that if $\rho \in S(\mathcal{S}_{M,N}, -\delta)$, for some $\delta > 0$, then each of the extreme points $x_i \in \mathcal{S}_{M,N}$ in the 
expression $\rho = \sum_{i=1}^{M^2N^2} p_i x_i$ can be replaced by $\tilde{x}_i$, where $[\tilde{x}_i]$ has rational entries. This is 
possible because the extreme points (pure product states) of $\mathcal{S}_{M,N}$ with rational entries are 
dense in the set of all extreme points of $\mathcal{S}_{M,N}$. However, when $\rho \notin S(\mathcal{S}_{M,N}, -\delta)$, then this 
argument breaks down. For example, when $\rho$ has full rank and is on the boundary of $\mathcal{S}_{M,N}$, 
then "sliding" $x_i$ to a rational position $\tilde{x}_i$ might cause $\tilde{x}_i$ to be outside of the affine space 
generated by $\{x_i\}_{i=1,\ldots,k}$. Figure 3.1 illustrates this in $\mathbb{R}^3$. Furthermore, even if $x_i$ can be 
nudged comfortably to a rational $\tilde{x}_i$, one would have to prove that $\langle \tilde{x}_i \rangle \in O(\mathrm{poly}(\langle [\rho] \rangle))$, 
where $\langle X \rangle$ is the size of the encoding of $X$. 

9 As usual, I use square brackets to denote a matrix with respect to the standard basis. 

Figure 3.1: The dashed triangle outlines the convex hull of $x_1$, $x_2$, and $x_3$, shown as dots at the 
triangle's vertices. This convex hull contains $\rho$, shown as a dot inside the triangle, and forms a 
(schematic) facet of $\mathcal{S}_{M,N}$. The curves represent the allowable choices for the $\tilde{x}_i$. Sliding any of the 
$x_i$ takes $\mathrm{conv}\{x_1, x_2, x_3\}$ outside of the facet. 

So, either the definition of NP does not apply (for weak membership formulations), or we 
possibly run into problems near the boundary of $\mathcal{S}_{M,N}$ (for exact formulations). Below we 
give an alternative formulation that is in NP; we will refer to this problem as QSEP. The 
definition of QSEP is just a precise formulation of the question "Given a density operator 
$\rho$, does there exist a separable density operator $\sigma$ that is close to $\rho$?" We must choose a 
guess-and-check procedure on which to base QSEP. Because I want to prove that QSEP is 
NP-hard, it is easier to choose the procedure which has the less complex check (but the larger 
guess). 

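Definition 16 (QSEP). Given a rational density matrix $[\rho]$ of dimension $MN$-by-$MN$, and 
positive rational numbers $\delta_p$, $\epsilon'$ and $\delta'$; does there exist a distribution $\{(\tilde{p}_i; \tilde{\alpha}_i, \tilde{\beta}_i)\}_{i=1,2,\ldots,M^2N^2}$ 
of unnormalised pure states $\tilde{\alpha}_i \in \mathbb{C}^M$, $\tilde{\beta}_i \in \mathbb{C}^N$, where $\tilde{p}_i \ge 0$, and $\tilde{p}_i$ and all elements of $\tilde{\alpha}_i$ 
and $\tilde{\beta}_i$ are $\lceil \log_2(1/\delta_p) \rceil$-bit numbers (complex elements are $x + iy$, $x, y \in \mathbb{R}$; where $x$ and $y$ 
are $\lceil \log_2(1/\delta_p) \rceil$-bit numbers) such that 

$\left| 1 - \|\tilde{\alpha}_i\|^2 \|\tilde{\beta}_i\|^2 \sum_{j=1}^{M^2N^2} \tilde{p}_j \right| < \epsilon'$ for all $i$ (3.17) 

and 

$\|[\rho] - \tilde{\sigma}\|_2^2 := \mathrm{tr}(([\rho] - \tilde{\sigma})^2) < \delta'^2$, (3.18) 

where $\tilde{\sigma} := \sum_{i=1}^{M^2N^2} \tilde{p}_i \tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger$. 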

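Note that these checks can be done exactly in polynomial-time, as they only involve 
elementary arithmetic operations on rational numbers. To reconcile this definition with the above 
intuition, we define $\hat{\sigma}$ as the separable density matrix that is the "normalised version" of $\tilde{\sigma}$: 

$\hat{\sigma} := \sum_{i=1}^{M^2N^2} \hat{p}_i \hat{\alpha}_i \hat{\alpha}_i^\dagger \otimes \hat{\beta}_i \hat{\beta}_i^\dagger$, (3.19) 

where $\hat{p}_i := \tilde{p}_i / \sum_j \tilde{p}_j$, $\hat{\alpha}_i := \tilde{\alpha}_i / \|\tilde{\alpha}_i\|$, and $\hat{\beta}_i := \tilde{\beta}_i / \|\tilde{\beta}_i\|$. Using the triangle inequality, we can 
derive that 

$\|\hat{\sigma} - \tilde{\sigma}\|_2 \le \sum_i \hat{p}_i \left| 1 - \|\tilde{\alpha}_i\|^2 \|\tilde{\beta}_i\|^2 \sum_j \tilde{p}_j \right|$, (3.20) 

where the righthand side is less than $\epsilon'$ when (3.17) is satisfied. If (3.18) is also satisfied, then 
we have 

$\|[\rho] - \hat{\sigma}\|_2 \le \|[\rho] - \tilde{\sigma}\|_2 + \|\tilde{\sigma} - \hat{\sigma}\|_2 < \delta' + \epsilon'$, (3.21) 

which says that the given $[\rho]$ is no further than $\delta' + \epsilon'$ away from a separable density matrix 
(in Euclidean norm). 10 

The decision problem QSEP is trivially in NP, as a nondeterministic Turing machine need 
only guess the $\lceil \log_2(1/\delta_p) \rceil$-bit distribution $\{(\tilde{p}_i; \tilde{\alpha}_i, \tilde{\beta}_i)\}_{i=1,2,\ldots,M^2N^2}$ and verify (in polytime) 
that (3.17) and (3.18) are satisfied. 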
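
The verification step can be sketched in exact rational arithmetic. The code below is an illustration only: it assumes real (rather than complex) amplitudes to stay short, does not enforce the bit-length or the number of elements of the certificate, and uses hypothetical names (`qsep_checks`, `outer`, `kron`). It evaluates checks (3.17) and (3.18) without any division or rounding.

```python
from fractions import Fraction

def outer(v):
    """Matrix v v^T for a real vector v."""
    return [[a * b for b in v] for a in v]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def qsep_checks(rho, certificate, eps, delta):
    """Return True if the certificate passes checks (3.17) and (3.18).

    rho: MN-by-MN matrix of Fractions (assumed real symmetric here).
    certificate: list of (p_i, alpha_i, beta_i) with rational real entries.
    eps, delta: the rationals called epsilon' and delta' in Definition 16.
    """
    dim = len(rho)
    total_p = sum(p for p, _, _ in certificate)
    # Check (3.17) for every i.
    for _, alpha, beta in certificate:
        norm_a = sum(x * x for x in alpha)
        norm_b = sum(x * x for x in beta)
        if abs(1 - norm_a * norm_b * total_p) >= eps:
            return False
    # Build the unnormalised candidate and check (3.18).
    sigma = [[Fraction(0)] * dim for _ in range(dim)]
    for p, alpha, beta in certificate:
        term = kron(outer(alpha), outer(beta))
        for r in range(dim):
            for c in range(dim):
                sigma[r][c] += p * term[r][c]
    # For real symmetric matrices, tr((rho - sigma)^2) is the sum of squared entries.
    dist_sq = sum((rho[r][c] - sigma[r][c]) ** 2
                  for r in range(dim) for c in range(dim))
    return dist_sq < delta ** 2
```

For example, with $M = N = 2$ the maximally mixed state passes with the four product basis vectors, each weighted by 1/4, whereas the same certificate fails for a maximally entangled state.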

3.2.3 NP-Hardness 

Gurvits [?] has shown the weak membership problem for $\mathcal{S}_{M,N}$ to be NP-hard with respect 
to the complexity-measure $(N + \langle [\rho] \rangle + \langle \delta \rangle)$. He demonstrates a Turing-reduction from 
PARTITION and makes use of the very powerful Yudin-Nemirovskii theorem (Theorem 4.3.2 
in [?]). 

We check now that QSEP is NP-hard, by way of a Karp-reduction from WMEM($\mathcal{S}_{M,N}$). 
We assume we are given an instance $I := ([\rho], \delta)$ of WMEM($\mathcal{S}_{M,N}$) and we seek an instance 
$I' := ([\rho'], \delta_p, \epsilon', \delta')$ of QSEP such that if $I'$ is a "yes"-instance of QSEP, then $I$ satisfies (3.3); 
otherwise $I$ satisfies (3.4). It suffices to use $[\rho'] = [\rho]$. It is clear that if $\delta'$ and $\epsilon'$ are chosen such 
that $\delta > \delta' + \epsilon'$, then $I'$ is a "yes"-instance only if $I$ satisfies (3.3). For the other implication, 
we need to bound the propagation of some truncation-errors. Let $p := \lceil \log_2(1/\delta_p) \rceil$. 

10 I have formulated these checks to avoid division; this makes the error analysis of the next section 
simpler. 

Recall how absolute errors accumulate when multiplying and adding numbers. Let $\tilde{x} = 
x + \Delta_x$ and $\tilde{y} = y + \Delta_y$ where $x$, $y$, $\tilde{x}$, $\tilde{y}$, $\Delta_x$, and $\Delta_y$ are all real numbers. Then we have 

$\tilde{x}\tilde{y} = xy + x\Delta_y + y\Delta_x + \Delta_x\Delta_y$ (3.22) 
$\tilde{x} + \tilde{y} = x + y + \Delta_x + \Delta_y$. (3.23) 

For $|x|, |y| < 1$, because we will be dealing with summations of products with errors, it is 
sometimes convenient just to use 

$|\tilde{x}\tilde{y} - xy| \le |\Delta_y| + |\Delta_x| + \max\{|\Delta_x|, |\Delta_y|\}$ (3.24) 

to obtain our cumulative errors (which do not need to be tight to show NP-hardness). For 
example, if $\tilde{x}$ and $\tilde{y}$ are the $p$-bit truncations of $x$ and $y$, where $|x|, |y| < 1$, then $|\Delta_x|, |\Delta_y| < 
2^{-p}$; thus a conservative bound on the error of $\tilde{x}\tilde{y}$ is 

$|\tilde{x}\tilde{y} - xy| \le |\Delta_y| + |\Delta_x| + \max\{|\Delta_x|, |\Delta_y|\} < 3 \cdot 2^{-p} < 2^2 \cdot 2^{-p} = 2^{-(p-2)}$. 
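
The conservative rule can be checked numerically with exact rationals. This is a sketch under the assumption that "p-bit truncation" means discarding all binary digits after the p-th one behind the binary point; the helper names are hypothetical.

```python
from fractions import Fraction
import math

def truncate(x, p):
    """p-bit truncation of a rational x: keep p binary digits after the point."""
    scaled = x * 2 ** p
    return Fraction(math.trunc(scaled), 2 ** p)

def product_error(x, y, p):
    """Absolute error of the product of the p-bit truncations of x and y."""
    xt, yt = truncate(x, p), truncate(y, p)
    return abs(xt * yt - x * y)
```

For instance, with $x = 1/3$, $y = -5/7$ and $p = 10$, the error returned by `product_error` is below $2^{-(p-2)} = 2^{-8}$, as the bound above requires.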

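Proposition 6. Let $\sigma \in \mathcal{S}_{M,N}$ be such that $\sigma = \sum_{i=1}^{M^2N^2} p_i \alpha_i \alpha_i^\dagger \otimes \beta_i \beta_i^\dagger$, and let 
$\{(\tilde{p}_i; \tilde{\alpha}_i, \tilde{\beta}_i)\}_{i=1,2,\ldots,M^2N^2}$ be the $p$-bit truncation of $\{(p_i; \alpha_i, \beta_i)\}_{i=1,2,\ldots,M^2N^2}$. 
Then $\|\sigma - \tilde{\sigma}\|_2 < M^3N^3 2^{-(p-7.5)}$, where 

$\tilde{\sigma} := \sum_{i=1}^{M^2N^2} \tilde{p}_i \tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger$. (3.25) 

Proof. Letting $\gamma_i := p_i \alpha_i \alpha_i^\dagger \otimes \beta_i \beta_i^\dagger - \tilde{p}_i \tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger$, we use the triangle inequality to get 

$\|\sigma - \tilde{\sigma}\|_2 \le \sum_i \|\gamma_i\|_2 = \sum_i \sqrt{\mathrm{tr}(\gamma_i^2)}$. (3.26) 

It suffices to bound the absolute error on the elements of $[\tilde{p}_i \tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger]$; using our conservative 
rule (3.24), these elements have absolute error less than $2^{-(p-7)}$. Thus $[\gamma_i]$ is an $MN$-by-$MN$ 
matrix with elements no larger than $2^{-(p-7)}$ in absolute value. It follows that $(\mathrm{tr}(\gamma_i^2))^{1/2}$ is no 
larger than $\sqrt{2}\,MN 2^{-(p-7)}$ in absolute value. Finally, we get 

$\|\sigma - \tilde{\sigma}\|_2 \le \sum_i \sqrt{\mathrm{tr}(\gamma_i^2)} \le M^3N^3 2^{-(p-7.5)}$. (3.27) 
□ 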

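Proposition 7. Let $\tilde{\sigma}$ be as in Proposition 6. Then for all $i = 1, 2, \ldots, M^2N^2$ 

$\left| 1 - \|\tilde{\alpha}_i\|^2 \|\tilde{\beta}_i\|^2 \sum_{j=1}^{M^2N^2} \tilde{p}_j \right| < M^3N^3 2^{-(p-5)}$. (3.28) 

Proof. The absolute error on $\sum_j \tilde{p}_j$ is $M^2N^2 2^{-p}$. The absolute error on $\|\tilde{\alpha}_i\|^2$ (resp. $\|\tilde{\beta}_i\|^2$) 
is no more than $M 2^{-(p-3)}$ (resp. $N 2^{-(p-3)}$). This gives total absolute error of 

$\left| 1 - \|\tilde{\alpha}_i\|^2 \|\tilde{\beta}_i\|^2 \sum_j \tilde{p}_j \right| < M^3N^3 2^{-(p-5)}$. (3.29) 

□ 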

Let $\delta' := M^3N^3 2^{-(p-8)}$ and $\epsilon' := M^3N^3 2^{-(p-5)}$ and set $p$ such that $\epsilon' + \delta' < \delta$. Suppose 
there exists a separable density matrix $\sigma$ such that $\|[\rho] - \sigma\|_2 = 0$. Then Propositions 6 and 7 
say that there exists a certificate $\tilde{\sigma}$ such that (3.17) and (3.18) are satisfied. Therefore, if $I'$ is 
a "no"-instance, then for all separable density matrices $\sigma$, $\|[\rho] - \sigma\|_2 > 0$; which implies that 
$I$ satisfies (3.4). I have exhibited a polytime Karp-reduction from WMEM($\mathcal{S}_{M,N}$) to QSEP 
(actually, from WMEM$_{\mathrm{in}}$($\mathcal{S}_{M,N}$) to QSEP). 

Fact 8. QSEP is in NPC$_T$. 



3.2.4 Towards a Karp Reduction 

To date, every decision problem (except for QSEP) that is in NPC$_T$ is also known to be in 
NPC$_K$ [?]. While it is strongly suspected that Karp and Turing reductions are inequivalent 
within NP, it would be very strange if QSEP, or some other formulation of the quantum 
separability problem, 11 is the first example that proves this inequivalence. We have an interesting 
open problem: 

Problem 4. Is QSEP in NPC$_K$? 

Note that, because of Fact 8, a negative answer to this problem implies that P $\neq$ NP. Thus it 
might be safer to work under the assumption that the answer is positive, and look for a Karp 
reduction from some Karp-NP-complete problem to some formulation $\Pi_{\mathrm{QSEP}}$ of the quantum 
separability problem. 

11 By "formulation of the quantum separability problem", I mean an approximate formulation that 
tends to EXACT QSEP as the accuracy parameters of the problem tend to zero. 

Technically, WMEM($\mathcal{S}_{M,N}$) is not in NP because it is not a decision problem. But the 
definition of "NP" can be modified to accommodate such weakened problems having 
overlapping decisions. According to this different definition, WMEM($\mathcal{S}_{M,N}$) is in "NP". 12 We can 
pose the following open problem, related to the one above. 

Problem 5. Does there exist a Karp reduction from some Karp-NP-complete problem to 
WMEM($\mathcal{S}_{M,N}$)? 

Finding a positive answer to this problem implies a positive answer for Problem 4. 
Alternatively, finding a negative answer to this problem does not, technically, imply that P $\neq$ NP, so 
may not win the million-dollar prize. 



3.2.5 Nonmembership in co-NP 

Is either EXACT QSEP or QSEP in co-NP? To avoid possible technicalities, we might 
first consider the presumably easier question of whether WMEM($\mathcal{S}_{M,N}$) is in "co-NP": Does 
every entangled state $\rho \notin S(\mathcal{S}_{M,N}, \delta)$ have a succinct certificate of not being in $S(\mathcal{S}_{M,N}, -\delta)$? 
It may or may not be the case that P equals NP $\cap$ co-NP, but a problem's membership in 
NP $\cap$ co-NP can be "regarded as suggesting" that the problem is in P [?]. Thus, we might 
believe that WMEM($\mathcal{S}_{M,N}$) is not in "co-NP" (since WMEM($\mathcal{S}_{M,N}$) is NP-hard). 

Let us consider this with regard to entanglement witnesses (which are candidates for 
succinct certificates of entanglement). We know that every entangled state has a (right) 
entanglement witness $A \in \mathbb{H}_{M,N}$ that detects it. However, it follows from the NP-hardness 
of WMEM($\mathcal{S}_{M,N}$) and Theorem 4.4.4 in [?] that the weak validity problem for $K = \mathcal{S}_{M,N}$ 
(WVAL($\mathcal{S}_{M,N}$)) is NP-hard: 13 

Definition 17 (Weak validity problem (WVAL)). Given a rational vector $c \in \mathbb{R}^n$, a 
rational number $\gamma$, and rational $\epsilon > 0$, assert either that 

$c^T x \le \gamma + \epsilon$ for all $x \in K$, or (3.30) 
$c^T x \ge \gamma - \epsilon$ for some $x \in K$. (3.31) 

So there is no known way to check efficiently that a hyperplane $\pi_{A,b}$ separates $\rho$ from $\mathcal{S}_{M,N}$ 
(given just the hyperplane); thus, an entanglement witness alone does not serve as a succinct 
certificate of a state's entanglement unless WVAL($\mathcal{S}_{M,N}$) is in P. However, one could imagine 
that there is a succinct certificate of the fact that a hyperplane $\pi_{A,b}$ separates $\rho$ from $\mathcal{S}_{M,N}$. If 
such a certificate exists, then WVAL($\mathcal{S}_{M,N}$) is in "NP" and WMEM($\mathcal{S}_{M,N}$) is in "co-NP". 14 
With regard to QSEP, we can prove the following: 

Fact 9. QSEP is not in co-NP, unless NP equals co-NP. 

12 For the weak membership problem, WMEM($K$) is in "NP" if and only if for all points $p \in 
S(K,-\delta)$ there exists a succinct certificate of the fact that $p \in S(K,\delta)$. According to [?], any 
$p \in S(\mathcal{S}_{M,N},-\delta)$ is in the convex hull of $M^2N^2$ affinely independent elements of a dense set of 
pure product states generated by rationals. By possibly tweaking each element, we can choose the 
rational numbers to have denominators no bigger than $\mathrm{poly}(M, N)/\delta$, so we can perform the checks 
in (3.15) and (3.16) efficiently, to conclude that $p \in S(\mathcal{S}_{M,N},\delta)$. 

13 Theorem 4.4.4 in [?], applied to $\mathcal{S}_{M,N}$, states that there exists an oracle-polynomial-time 
algorithm that solves WSEP($\mathcal{S}_{M,N}$) given an oracle for WVAL($\mathcal{S}_{M,N}$). 



This fact follows from the general theorem below. 

Theorem 10. If $\Pi$ is in NPC$_T$ and $\Pi$ is in co-NP, then NP equals co-NP. 

Proof. Since $\Pi$ is in co-NP, $\Pi^c$ is in NP. Let $\Pi'$ be any problem in co-NP. To show that 
co-NP equals NP, it suffices to show that co-NP is contained in NP; thus, it suffices to show 
that $\Pi'$ is in NP. The following reduction chain holds, since $\Pi'^c$ is in NP: $\Pi' \le_T \Pi'^c \le_T \Pi$. 
Because both $\Pi$ and $\Pi^c$ are in NP, the reduction $\Pi' \le_T \Pi$ can be carried out by a polytime 
nondeterministic Turing machine, which can "solve" any query to $O_\Pi$ by nondeterministically 
guessing and checking in polynomial-time the "yes"-certificate (if the query is a "yes"-instance 
of $\Pi$) or the "no"-certificate (if the query is a "no"-instance of $\Pi$). Thus $\Pi'$ is in NP. □ 

It is strongly conjectured that NP and co-NP are different [?], thus we might believe that 
QSEP is not in co-NP. 15 



3.3 Survey of algorithms for the quantum separability 
problem 

I concentrate on proposed algorithms that solve an approximate formulation of the quantum 
separability problem and have (currently known) asymptotic analytic bounds on their 
running times. For this reason, the SDP relaxation algorithm of Eisert et al. is not mentioned 
here (see Section 1.3.2); though, I do not mean to suggest that in practice it could not 
outperform the following algorithms on typical instances. As well, I do not analyse the 
complexity of the naive implementation of every necessary and sufficient criterion for separability, 
as it is assumed that this would yield algorithms of higher complexity than the following 
algorithms. 16 

14 WVAL($K$) is in "NP" means that for any $c$, $\gamma$, $\epsilon$ satisfying $c^T x \le \gamma - \epsilon$ for all $x \in K$, there 
exists a succinct certificate of the fact that (3.30) holds. 

15 We would like to be able to use Fact 9 to show that WVAL($\mathcal{S}_{M,N}$) is not in "NP" unless NP 
equals co-NP. However, for this, we would require that "WVAL($\mathcal{S}_{M,N}$) is in NP only if QSEP is in 
co-NP"; but this is not the case (only the converse holds). 

16 For an exhaustive list of all such criteria, see the forthcoming book by Bengtsson and Życzkowski 
[?]. 

The main purpose below is to get a time-complexity estimate in terms of the parameters 
$M$, $N$, and $\delta$, where $\delta$ is the accuracy parameter in WMEM($\mathcal{S}_{M,N}$). In the following, the only 
way precision and error are dealt with is similar to the above discussion, where we have a 
truncation-error resulting from approximating the continuum of pure product states by a finite 
set of finitely precise product vectors. The running-time estimates are based on the number of 
elementary arithmetic operations and do not attempt to deal with computer round-off error; 
I do not give estimates on the total amount of machine precision required. Instead, where 
rounding is necessary in order to avoid exponential blow-up of the representation of numbers 
during the computation, I assume that the working precision 17 can be set large enough that 
the overall effect of the round-off error on the final answer is either much smaller than $\delta$ or 
no larger than, say, $\delta/2$ (so that doubling $\delta$ takes care of the error due to round-off). 

3.3.1 Search for separable decompositions 

The most naive algorithm for any problem in NP consists of a search through all potential 
succinct certificates that the given problem instance is a "yes"-instance. Thus QSEP 
immediately gives an algorithm for the quantum separability problem. However, we can, in principle, 
reformulate QSEP to incorporate the ideas of Hulpke and Bruß [?] in order to get a better 
algorithm. 

The algorithm of Hulpke and Bruß 

First, let us see how to perform the checks in lines (3.15) and (3.16). Using simpler 
notation, suppose we are given $\{x_i : i = 1, 2, \ldots, k\} \subset \mathbb{R}^n$. This set is affinely independent if 
and only if $\{x_i - x_1 : i = 2, \ldots, k\}$ is linearly independent. Thus Gaussian elimination can be 
used to test for affine independence. Suppose $\{x_i : i = 1, 2, \ldots, n+1\}$ is affinely independent. 
Then the $x_i$ form the extreme points of the polytope $\mathrm{conv}\{x_i : i = 1, 2, \ldots, n+1\}$. Consider 
the facet of this polytope that does not contain $x_j$, and choose some $x_l \neq x_j$ in the facet. 
The normal $v_j$ to this facet is orthogonal to $x_i - x_l$, for all $i \neq j, l$, and is thus the generator 
of the nullspace of the matrix whose $n - 1$ rows are the vectors $x_i - x_l$. Again, Gaussian 
elimination can be used to solve for $v_j$. A point $p$ is in the polytope if and only if, for all 
$j = 1, 2, \ldots, n+1$, the halfspace $\{x : v_j^T x \le v_j^T x_l\}$ contains both or neither of $p$ and $x_j$; that 
is, both $p$ and $x_j$ are on the "same side" of the hyperplane $\{x : v_j^T x = v_j^T x_l\}$ corresponding 
to the facet not containing $x_j$. 
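
Both checks can be sketched with exact Gaussian elimination over the rationals. Note that the membership test below does not compute facet normals as described above; as a swapped-in but equivalent technique for a simplex, it solves for the barycentric coordinates of $p$ and checks that they are nonnegative. All function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a rational matrix (list of rows) by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def affinely_independent(points):
    """True iff {x_i - x_1 : i >= 2} is linearly independent."""
    diffs = [[a - b for a, b in zip(x, points[0])] for x in points[1:]]
    return rank(diffs) == len(diffs)

def barycentric(points, p):
    """Solve sum_i lam_i x_i = p and sum_i lam_i = 1 by Gauss-Jordan elimination.

    Assumes points are n+1 affinely independent vectors in R^n.
    """
    n = len(p)
    aug = [[Fraction(points[i][r]) for i in range(n + 1)] + [Fraction(p[r])]
           for r in range(n)]
    aug.append([Fraction(1)] * (n + 1) + [Fraction(1)])
    for col in range(n + 1):
        pivot = next(r for r in range(col, n + 1) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n + 1):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][-1] for i in range(n + 1)]

def in_simplex(points, p):
    """True iff p lies in conv(points), for affinely independent points."""
    return all(lam >= 0 for lam in barycentric(points, p))
```

Either route costs polynomially many arithmetic operations, which is all that the check in (3.15) and (3.16) requires.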

The algorithm of Hulpke and Bruß is basically a loop through all possible affinely 
independent sets $X$ of pure product states, with the check for whether $\mathrm{conv} X$ contains the given 
state $\rho$. However, the algorithm uses unbounded precision and performs its calculations to 
arbitrarily high precision so that it attempts to find such (arbitrarily precise) $X$ for $\rho \in \mathcal{S}_{M,N}$ 
that are arbitrarily close to the boundary of $\mathcal{S}_{M,N}$; it may even find such $X$ for $\rho \in \mathcal{S}_{M,N}$ 
that are on the boundary of the "cone" of positive Hermitian operators and hence on the 
boundary of $\mathcal{S}_{M,N}$. The algorithm only relaxes and solves the weak membership problem for 
states $\rho \in \mathcal{S}_{M,N}$ that are on the boundary between separable and entangled states. As argued 
at the beginning of this chapter, we are satisfied with an algorithm for the weak membership 
problem for all states. Thus we will formulate an approximate version of this algorithm whose 
precision requirements for the $X$ are bounded by $M$, $N$, and $\delta$. 18 

17 "Working precision" is defined as the number of significant digits the computer uses to represent 
numbers during the computation. 

Reformulation of QSEP 

Recall the mapping v : M M ,7v -> M. 1 defined in flTT^l) on page 1211 

Definition 18 (QSEP'). Given a rational density matrix $[\rho]$ of dimension $MN$-by-$MN$, and 
positive rational numbers $\delta_p$ and $\epsilon'$; does there exist a set $\{(\tilde{\alpha}_i, \tilde{\beta}_i)\}_{i=1,2,\ldots,M^2N^2}$ of unnormalised 
pure states $\tilde{\alpha}_i \in \mathbb{C}^M$, $\tilde{\beta}_i \in \mathbb{C}^N$, where all elements of $\tilde{\alpha}_i$ and $\tilde{\beta}_i$ are $\lceil \log_2(1/\delta_p) \rceil$-bit numbers 
(complex elements are $x + iy$, $x, y \in \mathbb{R}$; where $x$ and $y$ are $\lceil \log_2(1/\delta_p) \rceil$-bit numbers) such 
that 19 

$\left| 1 - \|\tilde{\alpha}_i\|^2 \|\tilde{\beta}_i\|^2 \right| < \epsilon'$ for all $i$ (3.32) 

and 

$\{v(\tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger)\}_i$ is affinely independent (3.33) 

and 

$[\rho] \in S(\mathrm{conv}\{v(\tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger)\}_i, \epsilon')$? (3.34) 

Note that (3.32) ensures that $\tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger$ is $\epsilon'$-close to an actual state $\hat{\alpha}_i \hat{\alpha}_i^\dagger \otimes \hat{\beta}_i \hat{\beta}_i^\dagger$, where 
$\hat{\alpha}_i := \tilde{\alpha}_i / \|\tilde{\alpha}_i\|$ and $\hat{\beta}_i := \tilde{\beta}_i / \|\tilde{\beta}_i\|$. The check in line (3.34) is an easy modification of the check 
described in the previous subsection. Let $p := \lceil \log_2(1/\delta_p) \rceil$. 

18 The full algorithm of Hulpke and Bruß is the parallel combination of the algorithm of Doherty 
et al. and this search for an $X$, along with a check for the case when $\rho$ is $\eta$-close to the boundary 
between separable and entangled states. 

19 Because I am ignoring round-off error, I assume that the function $v$ can be computed exactly, 
even though the elements $X_i$ of $B$ have square-root symbols appearing in them. (Because the 
computations required for the check are relatively simple, it might be possible to carry these irrationals 
symbolically through most of the computation, only requiring an approximation of them near the 
end when computing the normal to a hyperplane and checking the distance from various points to 
a hyperplane.) I wanted to avoid such an assumption in the proof of NP-hardness of QSEP. It will 
become clear, though, that QSEP' - with the $v(\tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger)$ truncated - could also be shown to 
be NP-hard with a suitable truncation-error analysis. 

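Suppose that, for some $\sigma \in \mathcal{S}_{M,N}$, $\sigma \in \mathrm{conv}\{\alpha_i \alpha_i^\dagger \otimes \beta_i \beta_i^\dagger\}_{i=1}^{M^2N^2}$ for normalised pure 
states $\alpha_i \in \mathbb{C}^M$ and $\beta_i \in \mathbb{C}^N$. Let $\tilde{\alpha}_i$ and $\tilde{\beta}_i$ be the $p$-bit truncations of $\alpha_i$ and $\beta_i$, and let 
$\gamma_i := \tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger - \alpha_i \alpha_i^\dagger \otimes \beta_i \beta_i^\dagger$. The rectangular coordinates of the entries in $[\gamma_i]$ are no bigger 
than $2^{-(p-6)}$. It follows that $\sqrt{\mathrm{tr}(\gamma_i^2)}$ is not larger than $MN 2^{-(p-6.5)}$: 

$\|\tilde{\alpha}_i \tilde{\alpha}_i^\dagger \otimes \tilde{\beta}_i \tilde{\beta}_i^\dagger - \alpha_i \alpha_i^\dagger \otimes \beta_i \beta_i^\dagger\|_2 \le MN 2^{-(p-6.5)}$. (3.35) 

Thus, setting $\epsilon' := MN 2^{-(p-6.5)}$ and setting $p$ such that $2\epsilon' < \delta$, it follows that QSEP' solves 
WMEM($\mathcal{S}_{M,N}$) with accuracy parameter $\delta$. This gives 

$p \ge \log_2(2MN/\delta) + 7$. (3.36) 

Therefore, to solve WMEM($\mathcal{S}_{M,N}$), it suffices to loop through all $(M^2N^2)$-subsets of 
$\lceil \log_2(2MN/\delta) + 7 \rceil$-bit unnormalised pure product states, checking the three conditions in 
QSEP'. Define $\Omega_p$ as the number of $p$-bit unnormalised pure product states resulting from the 
truncation (to $p$ bits) of all normalised pure product states. The complexity of this algorithm 
is 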



n 



\\og 2 (2MN/S)+T] 

M 2 N 2 



poly(M,JV,log(l/5)). (3.37) 



Since the pure product states can be parametrised by 2(M + N) —4 real parameters, we have 
the estimate 

Combined with the estimate I ] ~ n k , we get a rough asymptotic complexity estimate for 



the algorithm of 



k 



2 6 - 5 MN\ 2 ( m3n2 + m2n3 )~ 4m2n2 



poly(M,JV,log(l/5)). (3.39) 
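
To get a feel for how quickly this count grows, the following minimal sketch evaluates the number of bits from (3.36) and the base-2 logarithm of the rough subset count, using the parameter-count estimate $2^{p(2(M+N)-4)}$ for the number of truncated product states and the approximation $\binom{n}{k} \sim n^k$. The function name and return convention are illustrative assumptions, and the polynomial factor is ignored.

```python
import math

def naive_search_cost_log2(M, N, delta):
    """Return (p, log2_omega, log2_subsets) for the exhaustive search.

    p            : bits per coordinate, ceil(log2(2MN/delta) + 7)
    log2_omega   : log2 of the estimate 2^(p(2(M+N)-4)) for the number of
                   p-bit truncated product states
    log2_subsets : log2 of the rough count omega^(M^2 N^2), using C(n, k) ~ n^k
    """
    p = math.ceil(math.log2(2 * M * N / delta) + 7)
    n_params = 2 * (M + N) - 4          # real parameters of a pure product state
    log2_omega = p * n_params
    log2_subsets = log2_omega * (M * N) ** 2
    return p, log2_omega, log2_subsets

# Two qubits with accuracy parameter delta = 0.01
p, log2_omega, log2_subsets = naive_search_cost_log2(2, 2, 0.01)
print(p, log2_omega, log2_subsets)
```

Even for two qubits the exponent is already in the thousands of bits, which illustrates why this search is of theoretical interest only.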



In the interest of getting a rough lower bound on the complexity of this algorithm, I have
underestimated $\Omega_p$. The number $2^{p(2(M+N)-4)}$ corresponds to the number of different $p$-bit
settings of the $2(M+N)-4$ angles (phases and amplitudes) that parametrise the normalised
pure product states. The truncation-error analysis was done with respect to rectangular
coordinates, so this method of generating the elements $\tilde{\alpha}_j \otimes \tilde{\beta}_j$ may miss
some elements that would have resulted from a $p$-bit truncation of rectangular coordinates of
normalised pure product states. On the other hand, if we use all $p$-bit settings of the $2(M+N)$
rectangular coordinates to generate elements $\tilde{\alpha}_j \otimes \tilde{\beta}_j$, then many of the
elements generated will not satisfy
$|1 - \|\tilde{\alpha}_j\|^2\|\tilde{\beta}_j\|^2| < \epsilon'$. The most efficient way to systematically
generate the elements $\tilde{\alpha}_j \otimes \tilde{\beta}_j$ is left as an open problem:

Problem 6. What is the most efficient way to generate the $j$th element
$\tilde{\alpha}_j \otimes \tilde{\beta}_j$ of the set of $\Omega_p$ unnormalised pure product states
resulting from the $p$-bit truncation of all normalised pure product states?

We take the algorithm of this section as the best exhaustive search approach to solving
the approximate quantum separability problem. For example, it is better than searching all
of $S_{M,N}$ in order to calculate $E_{d^2}(\rho)$ of Section 1.3.2, and it is better than searching
all pure decompositions of $\rho$ in order to calculate $E'_F(\rho)$ of Section 1.3.3.



3.3.2 Bounded search for symmetric extensions 

In Section 1.3.1, we considered two tests - one that searches for symmetric extensions of
$\rho$, and a stronger one that searches for PPT symmetric extensions. Now we continue that
exposition, showing that recent results can put an upper bound on the number $k$ of copies
of subsystem A when solving an approximate formulation of the separability problem. The
bound only assumes symmetric extensions, not PPT symmetric extensions, so it is possible
that a better bound may be found for the stronger test.

If a symmetric state $\varrho \in \mathcal{D}((\mathbb{C}^d)^{\otimes n})$ has a symmetric extension to
$\mathcal{D}((\mathbb{C}^d)^{\otimes(n+m)})$ for all $m > 0$, then it is called (infinitely) exchangeable.
The quantum de Finetti theorem$^{20}$ says that the infinitely exchangeable state $\varrho$ is
separable. Recalling the terminology of Section 1.3.1, it is also possible to derive that, for
$\rho \in \mathcal{D}(\mathbb{C}^M \otimes \mathbb{C}^N)$, if there exists a symmetric extension of $\rho$ to
$k$ copies of subsystem A for all $k > 0$, then $\rho \in S_{M,N}$. This is the result that proves
that Doherty et al.'s hierarchy of tests is complete: if $\rho$ is entangled, then the SDP at some
level $k_0$ of the hierarchy will not be feasible (i.e. will not find a symmetric extension of
$\rho$ to $k_0$ copies of subsystem A). Konig and Renner [75] derived quite general results about
states $\rho$ that have symmetric extensions to $k$ copies of subsystem A. Their results give us
our upper bound on $k$.

The upper bound follows directly from the main theorem in [75]. The result is too technical
to summarise meaningfully without diverging from the aim of this thesis. We require the
following corollary:

Theorem 11 (Corollary of Theorem 6.1 in [75]). Suppose $\rho \in \mathcal{D}_{M,N}$ and there exists a
symmetric extension of $\rho$ to $k \geq 2$ copies of subsystem A. Then

\[ \mathrm{tr}|\rho - \sigma| \leq \frac{4M^6}{\sqrt{k-1}}, \tag{3.40} \]

for some $\sigma \in S_{M,N}$.

20 References for material in this paragraph may be found in [?].



The proof of this theorem is similar to the proof of Corollary 6.2 in [75]. Note that the result
uses the trace distance, $\mathrm{tr}|X - Y|$, between two operators $X$ and $Y$. Let us assume we are
solving the weak membership formulation of the quantum separability problem with respect
to the trace distance, and with accuracy parameter $\delta$. Then, setting
$\delta = 4M^6/\sqrt{k-1}$, we get the following upper bound for $k$:

Corollary 12. To solve WMEM($S_{M,N}$) (with respect to the trace distance) with accuracy
parameter $\delta$ by searching for symmetric extensions (as described in Section 1.3.1), it suffices
to look for symmetric extensions to

\[ \bar{k} := \lceil 16M^{12}/\delta^2 + 1 \rceil \tag{3.41} \]

copies of subsystem A.

To estimate the total complexity of the algorithm, note that

\[ d_{S_k} = \frac{[(M-1)+k][(M-2)+k]\cdots[(1)+k]}{(M-1)!} > \frac{k^{M-1}}{(M-1)!}. \tag{3.42} \]

Substituting $\bar{k}$ for $k$, we get

\[ d_{S_{\bar{k}}} > \left(\frac{16M^{12}}{\delta^2}\right)^{M-1}\frac{1}{(M-1)!}. \tag{3.43} \]
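
The size of these numbers can be checked directly. The sketch below evaluates the number of copies from Corollary 12 and the product appearing in (3.42), which equals the binomial coefficient $\binom{M+k-1}{M-1}$; reading it as the dimension of the symmetric subspace of $k$ copies of $\mathbb{C}^M$ is my assumption, as are the function names. Exact rational arithmetic is used so that the ceiling is not affected by rounding.

```python
import math
from fractions import Fraction

def k_bar(M, delta):
    """Number of copies of subsystem A from (3.41): ceil(16 M^12 / delta^2 + 1)."""
    delta = Fraction(delta)
    return math.ceil(16 * M**12 / delta**2 + 1)

def sym_dim(M, k):
    """Product in (3.42): [(M-1)+k]...[1+k] / (M-1)! = C(M+k-1, M-1)."""
    return math.comb(M + k - 1, M - 1)

# Two-dimensional subsystem A, accuracy parameter delta = 1/10
M, delta = 2, Fraction(1, 10)
k = k_bar(M, delta)
d = sym_dim(M, k)
lower = Fraction(k ** (M - 1), math.factorial(M - 1))  # right-hand side of (3.42)
print(k, d, d > lower)
```

For a qubit subsystem and $\delta = 1/10$ this already asks for several million copies, in line with the remark that the worst-case bound is unattractive.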

Just to solve the first constraint in (1.6) requires $\sqrt{n}$ (but usually far fewer) iterations
of a procedure that requires $O(m^2n^2)$ arithmetic operations, for
$m = (d_{S_{\bar{k}}}^2 - M^2)N^2$ and $n = d_{S_{\bar{k}}}^2N^2$.

Problem 7. Can the upper bound $\bar{k}$ be improved by taking into consideration the PPT
constraints in (1.6)?



Despite this unattractive worst-case bound, the hierarchy of tests has proved to be efficient
in practice for confirming that certain states are entangled (i.e. small $k$ suffices).

3.3.3 Cross-norm criterion via linear programming

Rudolph [?] derived a simple characterisation of separable states in terms of a
computationally complex operator norm $\|\cdot\|_\gamma$.$^{21}$ For a finite-dimensional vector space
$V$, let $T(V)$ be the class of all linear operators on $V$. The norm is defined on
$T(\mathbb{C}^M) \otimes T(\mathbb{C}^N)$ as

\[ \|t\|_\gamma := \inf\Big\{\sum_{i=1}^{k} \|u_i\|_1\|v_i\|_1 : t = \sum_{i=1}^{k} u_i \otimes v_i\Big\}, \tag{3.44} \]

where the infimum is taken over all decompositions of $t$ into finite summations of elementary
tensors, and $\|X\|_1 := \mathrm{tr}(\sqrt{X^\dagger X})$. Rudolph showed that $\|\rho\|_\gamma \leq 1$ if and only
if $\|\rho\|_\gamma = 1$, and that a state $\rho$ is separable if and only if $\|\rho\|_\gamma = 1$.

Perez-Garcia [77] showed that approximately computing this norm can be reduced to a
linear program (which is a special case of a semidefinite program):
$\min\{c^Tx : Ax = b, x \geq 0\}$, where $A \in \mathbb{R}^{n \times m}$, $b \in \mathbb{R}^n$,
$c \in \mathbb{R}^m$, and $x$ is a vector of $m$ real variables; here, $x \geq 0$ means that all entries
in the vector are nonnegative. An LP can be solved in $O(m^3L')$ arithmetic operations, where
$L'$ is the length of the binary encoding of the LP [?]. The linear program has on the order of
$M^2N^2$ variables and $M^{2M}N^{2N}(2k)^{2(M+N)}$ constraints, where $k$ is an integer that
determines the relative error$^{22}$ $(k/(k-1))^4 - 1$ on the computation of the norm. Thus
it may be solved in

\[ O\big(M^{2M+2}N^{2N+2}(2k)^{2(M+N)}\big) \tag{3.45} \]

arithmetic operations.

Suppose $\|\rho\|_\gamma$ is found to be no greater than $1 + \eta$. Then, we would like to use
$\eta$ to upper-bound the distance, with respect to either trace or Euclidean norm, from $\rho$ to
$S_{M,N}$. Unfortunately, we do not know how to do this. This drawback, along with the fact that
the error on the computed norm is relative as opposed to absolute, does not allow this algorithm to
be easily compared to the other algorithms I consider. Still, there may be a way to overcome
this problem, as follows.

Following Rudolph [?], a norm closely related to $\|\cdot\|_\gamma$ is

\[ \|t\|_S := \inf\Big\{\sum_{i=1}^{k} \|u_i\|_1\|v_i\|_1 : t = \sum_{i=1}^{k} u_i \otimes v_i\Big\}, \tag{3.46} \]

where the infimum is taken over all decompositions of $t$ into finite summations of elementary

21 The mathematical arguments behind the results in this section are nontrivial in that they involve 
notions from operator theory, which are tough-going for the nonexpert (me). Luckily, the results 
themselves can be stated and understood, at least superficially, with relative ease. 

22 The relative error of an approximation $\tilde{x}$ of $x$ is defined as $|\tilde{x} - x|/x$.

Hermitian tensors. This restriction on the decomposition implies that
$\|t\|_\gamma \leq \|t\|_S$; thus, if $\|\rho\|_S \leq 1$, then $\rho \in S_{M,N}$. Conversely, if
$\rho \in S_{M,N}$, then $\rho = \sum_i p_i\rho_i^A \otimes \rho_i^B$, and this decomposition ensures
$\|\rho\|_S \leq 1$. Thus $\|\rho\|_S \leq 1$ if and only if $\rho \in S_{M,N}$. The norm $\|\cdot\|_S$ is
related to an entanglement measure called "robustness".

The robustness of entanglement [?] of $\rho \in \mathcal{D}_{M,N}$ is defined as

\[ R(\rho) := \inf\{a^- : \rho = a^+\sigma^+ - a^-\sigma^-,\ a^\pm \geq 0,\ \sigma^\pm \in S_{M,N}\}. \tag{3.47} \]

In other words, the robustness is (a simple function of) the minimal $p$, $0 \leq p \leq 1$, such that

\[ \sigma^+ = p\sigma^- + (1-p)\rho \tag{3.48} \]

for separable states $\sigma^\pm$; the minimal $p$ is $p_{R(\rho)} := R(\rho)/(R(\rho)+1)$. Thus,
$R(\rho)$ corresponds to the minimal amount of separable "noise" ($\sigma^-$) that must be added to
$\rho$ in order to eliminate all the entanglement in $\rho$.

Using properties of "subcross norms" (see references in [?]), Rudolph shows [?] that for
$\rho \in \mathcal{D}_{M,N}$

\[ R(\rho) = \tfrac{1}{2}(\|\rho\|_S - 1); \tag{3.49} \]

the proof is based on the ideas of "base norm" used in [?].

The point is that if we could modify Perez-Garcia's algorithm so that it approximately
computes $\|\cdot\|_S$, then we could relate the result to a standard norm, as follows. Suppose the
algorithm allows us to assert that $\|\rho\|_S \leq 1 + 2\eta$. Then $R(\rho) \leq \eta$. Now, we have

\begin{align}
\|\rho - \sigma^+\| &= \|p_{R(\rho)}(\rho - \sigma^-)\| \tag{3.50} \\
&= p_{R(\rho)}\|\rho - \sigma^-\| \tag{3.51} \\
&= \frac{R(\rho)}{1 + R(\rho)}\|\rho - \sigma^-\| \tag{3.52} \\
&\leq 2\eta, \tag{3.53}
\end{align}

where 2 is an upper bound on the Euclidean diameter of the set of (normalised) density
operators (see Figure l5~4l on pagelHUJ). 23 

Problem 8. Can the algorithm of Perez-Garcia be modified so that it approximately
computes the norm $\|\cdot\|_S$?

Continuing with our hypothetical run-time analysis, how would we assert
$\|\rho\|_S \leq 1 + 2\eta$?



23 Actually, from the diagram, we could get a slightly better bound than 2. But, since this
discussion is purely "academic", it does not matter.

The actual algorithm returns an approximation $\tilde{x}$ such that
$\|\rho\|_\gamma \leq \tilde{x} \leq (k/(k-1))^4\|\rho\|_\gamma$.
Let us assume that a modification of the algorithm which computes $\|\rho\|_S$ would do the same.
If the modified algorithm returns a number that is less than 1, then we know that
$\|\rho\|_S \leq 1$. Otherwise, all that we need is an upper bound Abs on the absolute error of the
computation of $\|\rho\|_S$, since, if Abs $\leq \eta$, then we can comfortably conclude that either
$\|\rho\|_S \leq 1 + 2\eta$, or $\|\rho\|_S > 1$. Using the canonical basis $B$ of $\mathbb{H}_{M,N}$
described in Section 2.3, we have Max $:= \max_{\rho \in \mathcal{D}_{M,N}} \|\rho\|_S \in O(\mathrm{poly}(M,N))$,
which says the absolute error $\|\rho\|_S((k/(k-1))^4 - 1)$ is upper-bounded by
Abs $\in O(((k/(k-1))^4 - 1)\,\mathrm{poly}(M,N))$. The requirement Abs $\leq \eta$ leads
to a lower bound for $k$ of

\[ k \geq 1 + \frac{\mathrm{Max}^{1/4}}{(\eta + \mathrm{Max})^{1/4} - \mathrm{Max}^{1/4}}. \tag{3.54} \]
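
As a small numerical illustration, the requirement Abs $\leq \eta$ can be solved for $k$ directly if Abs is taken to be exactly Max$\cdot((k/(k-1))^4 - 1)$, that is, with the hidden constant in the $O(\cdot)$ set to one. That simplification, and the function names, are assumptions made only for this sketch.

```python
import math

def relative_error(k):
    """Relative error (k/(k-1))^4 - 1 of the computed norm, for integer k >= 2."""
    return (k / (k - 1)) ** 4 - 1

def smallest_k(eta, max_norm):
    """Smallest integer k >= 2 with max_norm * relative_error(k) <= eta."""
    r = ((eta + max_norm) / max_norm) ** 0.25
    # Closed-form threshold k >= r / (r - 1); used only as a starting guess,
    # then corrected by direct checks to be safe against rounding.
    k = max(2, math.floor(r / (r - 1)))
    while max_norm * relative_error(k) > eta:
        k += 1
    while k > 2 and max_norm * relative_error(k - 1) <= eta:
        k -= 1
    return k

print(smallest_k(1.0, 1.0))
```

Because the threshold grows roughly like $4\,\mathrm{Max}/\eta$ for small $\eta$, and $k$ enters the number of LP constraints as $(2k)^{2(M+N)}$, even a modest accuracy requirement is costly.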

Rudolph [?] has also shown that, for $\rho \in \mathcal{D}_{M,N}$,

\[ R(\rho) \geq \|\rho\|_\gamma - 1. \tag{3.55} \]

If equality holds in equation (3.55), then an argument similar to the one above could be used.
Rudolph notes that equality holds for pure states and "Werner" and "isotropic" states (see 

3). 

3.3.4 Fixed-point iterative method 



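Zapatrin [80] suggests an iterative method that solves the separability problem.$^{24}$ He
defines the function $\Phi : \mathbb{H}_{M,N} \to \mathbb{H}_{M,N}$:

\[ \Phi(X) := X + \lambda\Big(\rho - \iint e^{\langle\psi^A|\langle\psi^B|X|\psi^A\rangle|\psi^B\rangle}\,|\psi^A\rangle\langle\psi^A| \otimes |\psi^B\rangle\langle\psi^B|\,dS_M\,dS_N\Big), \tag{3.58} \]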



24 Facts about iterative methods: First, the basic Newton-Raphson method in one variable.
Suppose $\xi$ is a zero of a function $f : \mathbb{R} \to \mathbb{R}$ and that $f$ is twice differentiable
in a neighbourhood $U(\xi)$ of $\xi$. Then the Taylor expansion of $f$ about $x_0 \in U(\xi)$ gives

\begin{align}
0 = f(\xi) &= f(x_0) + (\xi - x_0)f'(x_0) + \cdots \tag{3.56} \\
0 &= f(x_0) + (\bar{\xi} - x_0)f'(x_0), \tag{3.57}
\end{align}

where $\bar{\xi} = x_0 - f(x_0)/f'(x_0)$ is an approximation of $\xi$. Repeating the process, with a
truncated Taylor expansion of $f$ about $\bar{\xi}$, gives a different approximation
$\bar{\bar{\xi}} = \bar{\xi} - f(\bar{\xi})/f'(\bar{\xi})$. This suggests the iterative method
$x_{i+1} = \phi(x_i)$, for $\phi(x) := x - f(x)/f'(x)$. If $f'(\xi) \neq 0$, the sequence $(x_i)_i$
converges to $\xi$ if $x_0$ is sufficiently close to $\xi$. More generally, if
$\Phi(x) : \mathbb{R}^n \to \mathbb{R}^n$ is a contractive mapping on $B(x_0, r)$, then the sequence
$(x_0, \Phi(x_0), \Phi(\Phi(x_0)), \ldots)$ converges to the unique fixed point in $B(x_0, r)$
(as long as $\Phi(x_0) \in B(x_0, r)$) [?].

where $S_M$ and $S_N$ are the complex origin-centred unit spheres (containing, respectively,
$|\psi^A\rangle$ and $|\psi^B\rangle$), and $\lambda$ is a constant dependent on the derivative (with respect
to $X$) of the quantity in parentheses ($\lambda$ is chosen so that $\Phi$ is a contraction mapping).
In earlier work [?, ?, ?], Zapatrin proves that any state $\sigma$ in the interior $S^\circ_{M,N}$ of
$S_{M,N}$ may be expressed

\[ \sigma = \iint e^{\langle\psi^A|\langle\psi^B|X_\sigma|\psi^A\rangle|\psi^B\rangle}\,|\psi^A\rangle\langle\psi^A| \otimes |\psi^B\rangle\langle\psi^B|\,dS_M\,dS_N, \tag{3.59} \]

for some Hermitian $X_\sigma$. Thus the function $\Phi$ has a fixed point $X_\rho = \Phi(X_\rho)$ if and
only if $\rho \in S^\circ_{M,N}$. When $\rho \in \mathcal{D}_{M,N}$, then a neighbourhood (containing 0)
in the domain of $\Phi$ can be found where iterating $X_{i+1} := \Phi(X_i)$, starting at $X_0 := 0$,
will produce a sequence $(X_i)_i$ that converges to $X_\rho$ when $\rho \in S^\circ_{M,N}$, but
diverges otherwise.
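
The iteration pattern itself, converging to a fixed point or failing to, can be illustrated on a scalar toy problem. The sketch below is not Zapatrin's map, which acts on Hermitian matrices and needs the integrals above; it only shows the generic loop $x_{i+1} = \phi(x_i)$, applied to a Newton-Raphson map for $f(x) = x^2 - 2$ and to a non-contractive map. All names are illustrative.

```python
def iterate_to_fixed_point(phi, x0, tol=1e-12, max_iter=100):
    """Iterate x_{i+1} = phi(x_i) until successive iterates differ by < tol.

    Returns (x, n_iter) on convergence, or (None, max_iter) if the
    iterates have not settled after max_iter steps.
    """
    x = x0
    for i in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, i
        x = x_next
    return None, max_iter

def newton_map(f, f_prime):
    """Newton-Raphson iteration map phi(x) = x - f(x)/f'(x)."""
    return lambda x: x - f(x) / f_prime(x)

# Converging case: the zero of x^2 - 2 near x0 = 1
phi = newton_map(lambda x: x * x - 2.0, lambda x: 2.0 * x)
root, n_iter = iterate_to_fixed_point(phi, x0=1.0)

# Non-contractive case: the iterates of x -> 2x + 1 run away from x0 = 1
runaway, _ = iterate_to_fixed_point(lambda x: 2.0 * x + 1.0, x0=1.0)
print(root, n_iter, runaway)
```

In a numerical setting, divergence can only be observed as failure to converge within an iteration budget, which is why the number of iterations enters the complexity estimate.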

Each evaluation of $\Phi(X)$ requires $M^2N^2/2 + MN$ integrations of the form

\[ \iint e^{\langle\psi^A|\langle\psi^B|X|\psi^A\rangle|\psi^B\rangle}\,\langle e_j^A|\psi^A\rangle\,\langle e_{j'}^B|\psi^B\rangle\,\langle\psi^A|e_k^A\rangle\,\langle\psi^B|e_{k'}^B\rangle\,dS_M\,dS_N, \tag{3.60} \]

where $\{e_j^A\}_j$ and $\{e_{j'}^B\}_{j'}$ are the standard bases for $\mathbb{C}^M$ and
$\mathbb{C}^N$. However, the off-diagonal ($j \neq k$, $j' \neq k'$) integrals have a complex integrand
so are each really two real integrals; thus the total number of real integrations is $M^2N^2$. Let
$\Xi_\delta$ represent the number of pure states at which the integrand needs to be evaluated in order
to perform each real numerical integration, in order to solve the overall separability problem with
accuracy parameter $\delta$. Zapatrin shows that the approximate number of iterations required is
upper-bounded by $2N(N+1)L(\log(1/\delta), \log(N))$, where $L$ is a bilinear function of its
arguments. The complexity of the entire algorithm is roughly (ignoring $\log(N)$ factors)

\[ \Xi_\delta\,\mathrm{poly}(M,N,\log(1/\delta)). \tag{3.61} \]

In numerical integration, the final result of the integration depends on the truncation error
at each point at which the integrand is numerically evaluated. This is detrimental to
the complexity of Zapatrin's algorithm and I just make the reasonable presumption that
$\Xi_\delta$, whatever it is, is far greater than $\Omega_p$ in the other algorithms analysed in this
thesis (for the same values of $M$, $N$, and $\delta$). I do not consider Monte-Carlo integration
methods (i.e. methods based on random sampling), because randomised algorithms for the separability
problem are outside our scope.



Chapter 4 

Reduction to Entanglement Witness 
Search 

In Section 3.3, we saw four proposed algorithms for solving an approximate formulation of
the quantum separability problem, all of which have analytically bounded running times. This
chapter introduces a fifth, which is based on the simple idea of searching for an entanglement
witness for the given state. In the language of convex body problems set up in Chapter 2, it
solves the in-biased weak separation problem for $K = S_{M,N}$:

Definition 19 (In-biased weak separation problem (WSEP$_{\mathrm{in}}$)). Given a rational vector
$p \in \mathbb{R}^n$ and rational $\delta > 0$, either

• assert $p \in S(K, \delta)$, or

• find a rational vector $c \in \mathbb{R}^n$ with $\|c\|_\infty = 1$ such that $c^Tx < c^Tp$ for every $x \in K$.$^1$

Of the algorithms of the previous chapter, this fifth algorithm is most closely related, in spirit,
to Zapatrin's algorithm of Section 3.3.4. This is because both algorithms reduce the quantum
separability problem to $\mathrm{poly}(M, N, \log(1/\delta))$ iterations of a difficult function
evaluation: in Zapatrin's case, the difficulty is a numerical integration; in the following
algorithm, the difficulty is the computation of a global maximum.

4.1 Overview 

In Section 4.2, I explain that the quantum separability problem can be reduced to the
computation of $b^*(A) := \max_{\sigma \in S_{M,N}}\{\mathrm{tr}(A\sigma)\}$ from Chapter 2. Recall that
exactly computing $b^*(A)$ corresponds precisely to the strong optimisation problem for
$K = S_{M,N}$ (see the definition

1 The $\ell_\infty$ norm appears here as a technicality, so that $c$ need not be normalised by a
possibly irrational multiplier. We will just use the Euclidean norm in what follows and have
$\|c\| \approx 1$.

in Chapter 2). The main algorithm of this thesis (henceforth referred to as "the new algorithm")
is a new polynomial-time reduction from WSEP$_{\mathrm{in}}$($K$) to the (weak) optimisation problem
for $K$; the algorithm works for any convex set $K$ that satisfies certain conditions - not just
$S_{M,N}$. Section 4.3 explains how such an algorithm can be utilised in an experimental setting
when faced with the problem of deciding whether an unknown state, of which many copies
are available, is entangled; such an algorithm can be applied to give a one-sided test for
separability even when only partial information about the state is available.

Recall that to solve WSEP$_{\mathrm{in}}$($S_{M,N}$), in the case where the given state $\rho$ is
entangled, means to provide a right entanglement witness that detects $\rho$. The new algorithm can
be viewed as an exhaustive search for an entanglement witness for the given $\rho$; and if no
entanglement witness is found, then the algorithm concludes that $\rho$ is close to separable. In
Section 4.4, I give the basic idea behind the search method employed by the new algorithm. This
search method is a variant of a well-known method in convex analysis, which I explain in Section
4.5. Both search methods yield oracle-polynomial-time reductions of the same asymptotic complexity.
I discuss the general form of such reductions in Section 4.6. Indeed, the new algorithm is not
an improvement on known reductions of its kind. The novelty of the work in this chapter,
with regard to the quantum information processing community, is the discovery that the best
known algorithm for the quantum separability problem (in the case $M = N$) is obtained by
a reduction to the weak optimisation problem over $S_{M,N}$. Section 4.7 gives an upper bound
on the complexity of the weak optimisation problem over $S_{M,N}$ and Section 4.8 contains the
comparison of the complexities of the new algorithm and the algorithms of Sections 3.3.1 and
3.3.2. With regard to the convex programming community, the new algorithm (whose details
are presented in Chapter 5) is a variant of well-known algorithms which, while perhaps not
offering any computational advantage, arguably holds intrinsic beauty because it is based on
a simple, intuitive heuristic (explained in Section 4.4).

4.2 Reduction to optimisation 

Recall the function $b^*(A) := \max_{\sigma \in S_{M,N}}\{\mathrm{tr}(A\sigma)\}$ from Chapter 2. This
function leads naturally to an algorithm for quantum separability as follows. For
$A \in \mathbb{H}_{M,N}$ such that $\mathrm{tr}(A^2) = 1$, define the function $d_\rho(A)$ as

\[ d_\rho(A) := b^*(A) - \mathrm{tr}(A\rho). \tag{4.1} \]

Geometrically, $d_\rho(A)$ is the signed distance from the state $\rho$ to the hyperplane
$\pi_{A,b^*(A)}$. It follows that $\rho$ is entangled if and only if there exists an $A$ such that
$d_\rho(A) < 0$. Any algorithm that determines whether the global minimum (over the unit sphere
$\{X \in \mathbb{H}_{M,N} : \mathrm{tr}(X^2) = 1\}$)$^2$ of $d_\rho(A)$ is negative thus solves the
separability problem. Any such algorithm would need a subroutine that approximately computes
$b^*(A)$ for any $A$. Since $b^*(A)$ is just the global maximum of a linear functional
$\mathrm{tr}(A\sigma)$ over all $\sigma \in S_{M,N}$, we have reduced the approximate
quantum separability problem to the weak optimisation problem for $K = S_{M,N}$:
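
For two qubits, the quantity $b^*(A) - \mathrm{tr}(A\rho)$ can be explored with a crude brute-force search, since the maximum of a linear functional over the separable states is attained at a pure product state. The sketch below is only an illustration and not one of the optimisation methods discussed in this thesis: it estimates $b^*(A)$ on a grid of Bloch-sphere angles, which in general gives only a lower bound on $b^*(A)$, so a negative result is a certificate only when the true maximum is known. The grid sizes and function names are assumptions. The example uses $A = \rho$ equal to the Bell state projector, for which the maximal product-state overlap is known to be $1/2$.

```python
import cmath
import math

def qubit(theta, phi):
    """Pure qubit state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def kron(a, b):
    """Tensor product of two state vectors given as lists."""
    return [x * y for x in a for y in b]

def expval(A, v):
    """Real part of <v|A|v> for a Hermitian matrix A (list of rows)."""
    n = len(v)
    return sum((v[i].conjugate() * A[i][j] * v[j]).real
               for i in range(n) for j in range(n))

def b_star_grid(A, n_theta=9, n_phi=16):
    """Grid estimate (a lower bound) of b*(A) over pure product states."""
    states = [qubit(math.pi * s / (n_theta - 1), 2 * math.pi * t / n_phi)
              for s in range(n_theta) for t in range(n_phi)]
    return max(expval(A, kron(a, b)) for a in states for b in states)

# rho is the Bell state (|00> + |11>)/sqrt(2); A = rho satisfies tr(A^2) = 1.
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
rho = [[bell[i] * bell[j] for j in range(4)] for i in range(4)]
A = rho
tr_A_rho = sum(A[i][j] * rho[j][i] for i in range(4) for j in range(4))
d = b_star_grid(A) - tr_A_rho
print(round(d, 6))
```

Here the grid contains the maximiser $|0\rangle \otimes |0\rangle$, so the estimate coincides with the true value and the signed distance is negative, as expected for an entangled state.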

Definition 20 (Weak optimisation problem (WOPT)). Given a rational vector $c \in \mathbb{R}^n$
and rational $\epsilon > 0$, either

• find a rational vector $y \in \mathbb{R}^n$ such that $y \in S(K, \epsilon)$ and $c^Tx \leq c^Ty + \epsilon$ for every $x \in K$;
or

• assert that $S(K, -\epsilon)$ is empty.$^3$

Theorem 4.4.7 from [?] says that WSEP($S_{M,N}$) $\leq_{\mathrm{T}}$ WOPT($S_{M,N}$). Thus, the
NP-hardness of the quantum separability problem is contained in the hardness of $b^*(A)$; that is,
WOPT($S_{M,N}$) is NP-hard.

The rest of this thesis develops an oracle-polynomial-time algorithm for
WSEP$_{\mathrm{in}}$($S_{M,N}$) assuming an oracle for WOPT($S_{M,N}$), which differs from those
already in the literature (as fully explained in Section 4.5). In terms of attempting to find a
practical algorithm for WSEP$_{\mathrm{in}}$($S_{M,N}$), the skeptic notices that such an algorithm may
not offer any advantage over more direct or naive approaches to solving
WSEP$_{\mathrm{in}}$($S_{M,N}$): instead of having to solve one instance of an NP-hard problem, we now
have to solve many! We will see at the end of this chapter that the theoretical complexity of such
an algorithm compares favourably with the others. This is, in part, because the optimisation in
$b^*(A)$ need only be carried out over the extreme points of $S_{M,N}$, which are parametrised by
only $2(M+N)-4$ (free) variables; the entire $S_{M,N}$ is parametrised by $M^2N^2 - 1$
(constrained) variables. From a practical point of view, there are many algorithms available for
optimising functions - far more than for computing the separation problem. Options for computing
WOPT($S_{M,N}$) include the SDP-relaxation method of Lasserre, as in Section 1.3.2, Lipschitz
optimisation, and Hansen's global optimisation algorithm using interval analysis [?]. I discuss the
complexity of computing WOPT($S_{M,N}$) in more detail in Section 4.7.

2 The minimum need only be over the $(M^2N^2 - 2)$-dimensional sphere
$\{X \in \mathbb{H}_{M,N} : \mathrm{tr}(X^2) = 1, \mathrm{tr}(X) = 0\}$. As well, as we will see in
Section 4.4, we can further restrict to the hemisphere that has positive inner product with $\rho$.
Note that, based solely on the convexity of $S_{M,N}$, $d_\rho(A)$ may have many local minimisers in
this hemisphere.

3 This will never be the case for us, as $S_{M,N}$ is not empty.

4.3 Detecting Entanglement of an Unknown State Using Partial Information



I now consider the task of trying to decide whether a completely unknown physical state
$\rho$, of which many copies are available, is entangled. For simplicity, we restrict to
$\rho \in \mathcal{D}_{2,2}$ but the discussion can be applied to a bipartite system of any dimension,
replacing Pauli operators with canonical generators of SU($M$) and SU($N$) or any orthonormal
Hermitian product basis. For such $\rho$, this problem has already been addressed in [?], where the
so-called "structural physical approximation of an unphysical map" [?] was used to implement the
Peres-Horodecki positive partial transpose (PPT) test [?, ?]. While the structural physical
approximation is experimentally viable in principle, it is very difficult to implement. Thus, the
easiest way to test for entanglement at present is to perform "state tomography" in order to get
good estimates of 15 real parameters that define $\rho$, then reconstruct the density matrix for
$\rho$ and carry out the PPT test on this matrix.

An experimentalist has many choices of which 15 parameters to estimate: the expectations
of any 15 linearly independent observables qualify, as do the probability distributions of any
5 mutually unbiased (four-outcome) measurements [?, ?]. Whatever 15 parameters are
chosen, we assume that the basic tool of the experimentalist is the ability to perform local
two-outcome measurements on each qubit, e.g. measuring $\sigma_1$ on the first qubit and $\sigma_2$
on the second. Under this assumption, the scenario where the two qubits of $\rho$ are far apart is
easily handled if classical communication is allowed between the two labs. We further assume, for
simplicity, that the set of these local two-outcome measurements is the set of Pauli operators
$\{\sigma_i\}_{i=0,1,2,3}$ (defined earlier). If $\sigma_i$ is measured on the first qubit and
$\sigma_j$ on the second, repeating this procedure on many copies of $\rho$ gives good estimations of
the three expectations $\langle\sigma_i \otimes \sigma_0\rangle$,
$\langle\sigma_0 \otimes \sigma_j\rangle$, and $\langle\sigma_i \otimes \sigma_j\rangle$ (where the
subscript "$\rho$" is omitted for readability). Let us call this procedure measuring
$\sigma_i\sigma_j$.

Suppose the experimentalist sets out to solve our problem and begins the data collection
by measuring $\sigma_1\sigma_1$ and then $\sigma_2\sigma_2$. Even though only 6 of the 15 independent
parameters defining $\rho$ have been found, the example in Section 2.2.3 shows that $\rho$ is
entangled if one of the four inequalities (2.11) is true. It is straightforward to show that if
none of these inequalities is true, then no entanglement witness in the span of
$\{\sigma_1 \otimes \sigma_1, \sigma_2 \otimes \sigma_2\}$ can detect $\rho$ if it is entangled.$^4$
However, there may be an entanglement witness in the span of

\[ \{\sigma_0 \otimes \sigma_1,\ \sigma_0 \otimes \sigma_2,\ \sigma_1 \otimes \sigma_1,\ \sigma_2 \otimes \sigma_2,\ \sigma_1 \otimes \sigma_0,\ \sigma_2 \otimes \sigma_0\} \]

that does detect $\rho$.$^5$

4 To show this, it suffices to find four separable states whose projections onto
span$\{\sigma_1 \otimes \sigma_1, \sigma_2 \otimes \sigma_2\}$ are the four vertices of the square with
vertices $(\frac{1}{2},0)$, $(0,\frac{1}{2})$, $(-\frac{1}{2},0)$, and $(0,-\frac{1}{2})$; such
states are $\frac{1}{4}I \pm \frac{1}{4}\sigma_i \otimes \sigma_i$ for $i = 1,2$. The result then
follows from convexity of $S_{2,2}$.

More generally, at any stage of the data-gathering process, if we have the set of
expectations $\{\langle\sigma_i \otimes \sigma_j\rangle : (i,j) \in T\}$, then $\rho$ is entangled if
there is an entanglement witness in the span of $\{\sigma_i \otimes \sigma_j : (i,j) \in T\}$ that
detects $\rho$ ($T \subseteq \{(k,l) : k,l \in \{0,1,2,3\}\} \setminus (0,0)$). If the
experimentalist has access to a computer program that can quickly discover such an
entanglement witness (if it exists), then the data-gathering process can be terminated early and no
more qubits have to be used to detect that $\rho$ is entangled. The new algorithm is just such a
program. To see this, note that the projection $\bar{S}_{2,2}$ of $S_{2,2}$ onto
span$\{\sigma_i \otimes \sigma_j : (i,j) \in T\}$ is a full-dimensional convex subset of
$\mathbb{R}^{|T|}$, and the projection $\bar{\rho}$ of $\rho$ onto
span$\{\sigma_i \otimes \sigma_j : (i,j) \in T\}$ is a point in $\mathbb{R}^{|T|}$ such that
$\bar{\rho} \notin \bar{S}_{2,2}$ if and only if there is an entanglement witness in the
span of $\{\sigma_i \otimes \sigma_j : (i,j) \in T\}$ that detects $\rho$. Since the new algorithm can
be applied to any full-dimensional convex set (satisfying certain conditions), we can apply it to
$\bar{S}_{2,2}$.
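
The data available after measuring $\sigma_1\sigma_1$ and $\sigma_2\sigma_2$ can be simulated for a known state. The sketch below computes the expectations $\langle\sigma_i \otimes \sigma_j\rangle$ for a set $T$ of index pairs and applies a two-observable test. The test is the standard fact that separable two-qubit states satisfy $|\langle\sigma_1 \otimes \sigma_1\rangle| + |\langle\sigma_2 \otimes \sigma_2\rangle| \leq 1$, which matches the square described in the footnote above once the basis is rescaled; presenting it as the content of the four inequalities referred to above is my assumption, as are the labelling $\sigma_1 = X$, $\sigma_2 = Y$, $\sigma_3 = Z$ and all function names.

```python
# Pauli matrices sigma_0..sigma_3 (labelling 1 = X, 2 = Y, 3 = Z is assumed)
PAULI = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def kron(A, B):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def expectation(rho, O):
    """tr(rho O), returned as a real number (rho and O are Hermitian)."""
    n = len(rho)
    return sum(rho[i][j] * O[j][i] for i in range(n) for j in range(n)).real

def measured_expectations(rho, T):
    """Expectations <sigma_i (x) sigma_j> for all index pairs (i, j) in T."""
    return {(i, j): expectation(rho, kron(PAULI[i], PAULI[j])) for (i, j) in T}

def violates_two_observable_bound(e11, e22):
    """True if |<s1 s1>| + |<s2 s2>| > 1, which no separable state allows."""
    return abs(e11) + abs(e22) > 1

# Bell state (|00> + |11>)/sqrt(2) and the maximally mixed state
rho_bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
rho_mixed = [[0.25 if i == j else 0 for j in range(4)] for i in range(4)]
T = [(1, 1), (2, 2), (1, 0), (0, 1), (2, 0), (0, 2)]
e = measured_expectations(rho_bell, T)
print(e[(1, 1)], e[(2, 2)], violates_two_observable_bound(e[(1, 1)], e[(2, 2)]))
```

For the Bell state the two correlators alone already reveal entanglement, so in this case the experimentalist could stop after estimating 6 of the 15 parameters.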

We view the new algorithm as an extra tool that an experimentalist can use to facilitate
entanglement detection and minimise the number of copies of $\rho$ that must be measured -
essentially, trading classical resources for quantum resources. In Section 5.5, I detail how the
new algorithm is applied to this experimental scenario.



4.4 New method to solve separation with optimisation 

Now we shed the quantum physical notation, in favour of the simpler and more general
convex analysis notation. To reconcile the two notations, recall the discussion at the beginning
of Section 2.3 that relates the trace inner product in $\mathbb{H}_{M,N}$ to the dot product in
$\mathbb{R}^{M^2N^2-1}$ and explains that $S_{M,N}$ may be viewed as a convex subset of
$\mathbb{R}^{M^2N^2-1}$ that properly contains the origin (which corresponds to the maximally mixed
state $I_{M,N}$).

So, assume we have a full-dimensional convex set $K \subset \mathbb{R}^n$ that properly contains the
origin. The ultimate goal is to develop a new algorithm for WSEP$_{\mathrm{in}}$, given an oracle for
WOPT($K$). Until Chapter 5, we ignore the weakness of the separation and optimisation
problems, as it obfuscates the main idea; that is, we assume we are solving SSEP($K$) with an
oracle for SOPT($K$).

Suppose we have an oracle $O_{\mathrm{SOPT}(K)}$ for the optimisation problem over $K$ such that,
given a nonzero input vector $c$, $O_{\mathrm{SOPT}(K)}$ outputs a point
$O_{\mathrm{SOPT}(K)}(c) = k_c \in K$ that maximises $c^Tx$ over all $x \in K$. An important step in
developing the algorithm is noting that, given $O_{\mathrm{SOPT}(K)}$, the search for a separating
hyperplane reduces to the search for a region on the $(n-1)$-dimensional surface of the unit
hypersphere $S_n$ (embedded in $\mathbb{R}^n$) centred at the origin. For $p \notin K$, this region
$M_p$ is simply $\{c \in S_n : c^Tk_c < c^Tp\}$ (see Figure [?]).

5 The idea of searching for an entanglement witness in the span of operators whose expected values 
are known was discovered independently and applied, in a special case, to quantum cryptographic 
protocols in 






The first observation is that, since $K$ properly contains the origin, $M_p$ is contained in the
hemisphere defined by $\{x : p^Tx > 0\}$:

Fact 13. For all $m \in M_p$, $m^Tp > 0$.

Proof. Let $m \in M_p$. Then $m^Tp > m^Tk$ for all $k \in K$. But the fact that the 0-vector is
properly contained in $K$ implies that there exists $k \in K$ such that $m^Tk > 0$. □

The second observation, Lemma 14, is based on the following heuristic, which can be
pictured in $\mathbb{R}^2$ and $\mathbb{R}^3$. Suppose $c$, $\|c\| = 1$, is not in $M_p$ (but is
reasonably close to $M_p$) and that the oracle returns $k_c$. What is a natural way to modify the
vector $c$, so that it gets closer to $M_p$? Intuition dictates moving $c$ away from $k_c$ and
towards $p$, that is, adding a small component of the vector $(p - k_c)$ to $c$, in order to
generate a new guess $c' = c + \lambda(p - k_c)/\|p - k_c\|$, for some $\lambda > 0$, which we could
then give to the oracle again (see Figure [?]). Incidentally, I have found that this heuristic
actually works: the following little program, in the context of the quantum separability problem,
always found entanglement witnesses for entangled states in $\mathcal{D}_{2,2}$, even with very tiny
entanglement concurrence [?] (the value of $N'$ required depends on the concurrence):

c := p/||p||; d := 1; i := 0;
WHILE (d ≥ 0 AND i < N') DO {
    k_c := O_SOPT(K)(c);
    d := c^T k_c − c^T p;
    IF (d < 0) THEN { RETURN c }
    ELSE { c := c + d(p − k_c)/||p − k_c||; c := c/||c||; i := i + 1 } };
RETURN "INCONCLUSIVE"

Notice the connection of the above program to the function $d_\rho(A)$ of Section 4.2. This program
can be regarded as an extremely simple heuristic algorithm for the separation problem when
given an optimisation oracle and promised that $p \notin K$ (of course, it may give inconclusive
results; in practice, one should set $N'$ as large as is practically feasible).
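
A runnable rendering of this heuristic on a toy convex set may help fix ideas. In the sketch below, $K$ is a disc in $\mathbb{R}^2$ that contains the origin, for which the optimisation oracle has a closed form; the disc, the test points and all function names are assumptions chosen for illustration and have nothing to do with $S_{M,N}$. Because the step length equals $d$, the behaviour of the heuristic depends on the scale of $K$; the numbers here are chosen so that one update already succeeds, and other inputs may well end inconclusively.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalise(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def disc_oracle(centre, radius):
    """Optimisation oracle for a disc: returns the point of K maximising c.x."""
    def oracle(c):
        c = normalise(c)
        return [centre[i] + radius * c[i] for i in range(len(c))]
    return oracle

def find_separating_direction(p, oracle, max_iter=1000):
    """Heuristic search for a unit c with c.x < c.p for all x in K.

    Returns c, or None if inconclusive after max_iter oracle calls.
    Assumes p is nonzero and differs from every point the oracle returns.
    """
    c = normalise(p)
    for _ in range(max_iter):
        k_c = oracle(c)
        d = dot(c, k_c) - dot(c, p)
        if d < 0:
            return c
        step = normalise([pi - ki for pi, ki in zip(p, k_c)])
        c = normalise([ci + d * si for ci, si in zip(c, step)])
    return None

centre, radius = [2.5, 0.0], 5.0
oracle = disc_oracle(centre, radius)
p_out = [1.5, 5.0]      # just outside the disc
p_in = [1.0, 1.0]       # inside the disc, so no separating direction exists
c = find_separating_direction(p_out, oracle)
print(c, find_separating_direction(p_in, oracle, max_iter=50))
```

The first starting direction $p/\|p\|$ fails here because the disc is not centred at the origin; the update then tilts $c$ away from $k_c$ and towards $p$, as in the heuristic.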

Interestingly, the above heuristic can be formalised as follows. If $c$ is not in $M_p$ but is
sufficiently close to $M_p$, then $c$, $p$, and $k_c$ can be used to define a hemisphere which
contains $M_p$ and whose great circle cuts through $c$. More precisely:

Lemma 14. Suppose $m \in M_p$, $c \notin M_p$, and let
$a := (p - k_c) - \mathrm{Proj}_c(p - k_c)$. If $m^Tc \geq 0$, then $m^Ta > 0$.

Proof. Note that $m^Ta = m^T(p - k_c) - [c^T(p - k_c)](m^Tc)$. The hypotheses of the lemma
immediately imply that $m^T(p - k_c) > 0$ and $c^T(p - k_c) \leq 0$. Thus, if $m^Tc \geq 0$, then
$m^Ta > 0$. □

The lemma gives a method for reducing the search space after each query to
$O_{\mathrm{SOPT}(K)}$ by giving a cutting plane, $\{x : a^Tx = 0\}$, that slices off a portion of
the search space. The idea is that at each iteration a vector $c \in S_n$ is chosen that is
approximately in the centre of the remaining search space. Then $c$ is given to the oracle which
returns $k_c$. If $c^Tp > c^Tk_c$, then a separating hyperplane for $p$ has been found and the
algorithm terminates. Otherwise, as long as $m^Tc \geq 0$ for all $m \in M_p$, the lemma says that
the current search space may be sliced through its centre $c$ and the origin, and one half
discarded. Because the search space is being approximately halved at each step, the algorithm
quickly either finds a separating hyperplane for $p$ or concludes that $p \in K$.
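
The cutting plane can be computed and checked numerically on a toy example. In the sketch below, $K$ is a disc in $\mathbb{R}^2$ containing the origin (an assumption made purely for illustration, with a closed-form optimisation oracle), $c$ is a direction that fails to separate $p$ from $K$, and $m$ is a direction that does separate. The code forms the vector $a$ obtained by removing from $p - k_c$ its component along $c$, and confirms that $a$ is orthogonal to $c$ and that $m$ lies on the retained side of the plane.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalise(u):
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]

def cut_normal(c, p, k_c):
    """a := (p - k_c) - Proj_c(p - k_c), for a unit vector c."""
    w = [pi - ki for pi, ki in zip(p, k_c)]
    t = dot(c, w)
    return [wi - t * ci for wi, ci in zip(w, c)]

# Toy set K: disc of radius 5 centred at (2.5, 0); it contains the origin.
centre, radius = [2.5, 0.0], 5.0
p = [1.5, 5.0]                                          # a point just outside K

c = normalise(p)                                        # first test direction
k_c = [centre[i] + radius * c[i] for i in range(2)]     # oracle answer for c
a = cut_normal(c, p, k_c)

# A direction that does separate p from K: from the disc centre towards p.
m = normalise([p[0] - centre[0], p[1] - centre[1]])
support_m = dot(m, centre) + radius                     # max of m.x over K
print(a, dot(m, a), support_m < dot(m, p))
```

Every later test direction is then restricted to the half-space $\{x : a^Tx \geq 0\}$, which is how the search region shrinks from one oracle call to the next.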

The above search problem can easily be reduced to an instance of the convex feasibility
problem:

Feasibility Problem: Given a convex set $K' \subset \mathbb{R}^n$, either

(i) find a point $k' \in K'$, or

(ii) assert that $K'$ is empty.

In this case, the convex set $K'$ is the set $K_p$, which is defined as

$$K_p := [\mathrm{ConvexHull}(M_p \cup \{0\})] \setminus \{0\}, \qquad (4.2)$$

where $0 \in \mathbb{R}^n$ denotes the origin. The set $K_p$, if not empty, can be viewed as a cone-like object,
emanating from the origin and cut off by the unit hypersphere (see Figure 4.1). Several well-known
oracle-polynomial-time algorithms exist for the feasibility problem for $K'$ in the case
where there is a separation oracle for $K'$ that, given a test point $y \in \mathbb{R}^n$, returns either a
hyperplane that separates $y$ from $K'$ or asserts that $y \in K'$. The oracle $\mathcal{O}_{\mathrm{SOPT}(K)}$, along with
the lemma above, essentially gives a separation oracle for $K_p$, as long as the test vectors $c$ given
to $\mathcal{O}_{\mathrm{SOPT}(K)}$ satisfy $m^T c \ge 0$ for all $m \in M_p$. Because of this last requirement, none of the
existing algorithms can be applied directly. However, the analytic-center algorithm due to
Atkinson and Vaidya [91] beautifully lends itself to a modification that allows the requirement
$m^T c \ge 0$ for all $m \in M_p$ to be satisfied. I will say more about such algorithms in Section 4.6.
Finding a vector in $M_p$ and finding a nonzero point in $K_p$ are equivalent for our purpose.
From now on, we regard the "search space" as the full-dimensional origin-centred hyperball
$B_n$ in $\mathbb{R}^n$; however, to make the analysis more transparent, we will always normalise each test
point before giving it to the oracle.

How could we ensure that all our test vectors $c$ satisfy $m^T c \ge 0$ for all $m \in M_p$? Recall the
earlier fact that the set $K_p$ is contained in the halfspace $\{x : p^T x > 0\}$. Let $a_1 := p/\|p\|$.
Thus, straight away, the search space is reduced to the hemisphere $B_n \cap \{x : a_1^T x \ge 0\}$.
The first test vector to give to the oracle $\mathcal{O}_{\mathrm{SOPT}(K)}$ is $p/\|p\|$, which clearly has nonnegative
dot-product with all points in $K_p$ and hence all $m \in M_p$. By way of induction, assume
that, at some later stage in the algorithm, the current search space has been reduced to
$P := B_n \cap \bigcap_{i=1}^{h} \{x : a_i^T x \ge b_i\}$ by the generation of cutting planes $\{x : a_i^T x = b_i\}$, where the
$a_i$, for $i = 2, 3, \ldots, h$, are the normalised $a$ from $h - 1$ invocations of the lemma. Let $\omega$ be the
"centre" of $P$, and suppose that this "centre" is a positive linear combination of the normal
vectors $a_i$, that is,

$$\omega = \sum_{i=1}^{h} \lambda_i a_i, \quad \text{where } \lambda_i \ge 0 \text{ for all } i = 1, 2, \ldots, h. \qquad (4.3)$$

Then, by inductive hypothesis, this implies that $m^T \omega \ge 0$ for all $m \in M_p$. Thus, $c := \omega/\|\omega\|$
is a suitable vector to give to the oracle $\mathcal{O}_{\mathrm{SOPT}(K)}$ and use in the lemma. Therefore, it suffices
to find a definition of "centre $\omega$ of $P$" that satisfies (4.3), in order that all our test vectors $c$
satisfy $m^T c \ge 0$ for all $m \in M_p$.

Reducing the separation problem for $K$ to the convex feasibility problem for some $K'$,
while using the optimisation oracle for $K$ as a separation oracle for $K'$, is not a new concept
in convex analysis. But the precise way that the lemma of this section generates each new cutting
plane, incorporating the intuitive correction heuristic, does not appear in the literature. This is
likely because there is a well-known, standard way to carry out such a reduction, which I
cover in the next section.



4.5 Connection to standard method 

The standard way to perform the reduction of the last section may be found in the synthesis 
of Lemma 4.4.2 and Theorem 4.2.2 in [jj. 

Definition 21 (Polar of K). The polar $K^*$ of a full-dimensional convex set $K \subset \mathbb{R}^n$ that
contains the origin is defined as⁶

$$K^* := \{c \in \mathbb{R}^n : c^T x \le 1 \ \ \forall x \in K\}. \qquad (4.4)$$

If $c \in K^*$, then the plane $\pi_{c,1} = \{x : c^T x = 1\}$ separates $p \in \mathbb{R}^n$ from $K$ when $c^T p > 1$. Thus,
the separation problem for $p$ is equivalent to the feasibility problem for $Q_p$,⁷ defined as

$$Q_p := K^* \cap \{c : p^T c \ge 1\}. \qquad (4.5)$$

6 In some textbooks, e.g. [44], $K^*$ is called the "1-polar".

As mentioned in the previous section (and elaborated on in the next section), to solve the
feasibility problem for any $K'$, it suffices to have a separation routine for $K'$. Because we
can easily build a separation routine $\mathcal{O}_{\mathrm{SSEP}(Q_p)}$ for $Q_p$ out of $\mathcal{O}_{\mathrm{SSEP}(K^*)}$, it suffices to have a
separation routine $\mathcal{O}_{\mathrm{SSEP}(K^*)}$ for $K^*$ in order to solve the feasibility problem for $Q_p$.⁸ Building
$\mathcal{O}_{\mathrm{SSEP}(Q_p)}$ out of $\mathcal{O}_{\mathrm{SSEP}(K^*)}$ is done as follows:

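Routine $\mathcal{O}_{\mathrm{SSEP}(Q_p)}(y)$:
    CASE: $p^T y < 1$
        RETURN $-p$
    ELSE: $p^T y \ge 1$
        CALL $\mathcal{O}_{\mathrm{SSEP}(K^*)}(y)$
        CASE: $\mathcal{O}_{\mathrm{SSEP}(K^*)}(y)$ returns separating vector $q$
            RETURN $q$
        ELSE: $\mathcal{O}_{\mathrm{SSEP}(K^*)}(y)$ asserts $y \in K^*$
            RETURN "$y \in Q_p$"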

It remains to show that the optimisation routine $\mathcal{O}_{\mathrm{SOPT}(K)}$ for $K$ gives a separation routine
$\mathcal{O}_{\mathrm{SSEP}(K^*)}$ for $K^*$. Suppose $y$ is given to $\mathcal{O}_{\mathrm{SOPT}(K)}$, which returns $k \in K$ such that $y^T x \le
y^T k =: b$ for all $x \in K$. If $b \le 1$, then $\mathcal{O}_{\mathrm{SSEP}(K^*)}$ may assert $y \in K^*$. Otherwise, $\mathcal{O}_{\mathrm{SSEP}(K^*)}$
may return $k$, because $\pi_{k,1}$ (and hence $\pi_{k,b}$) separates $y$ from $K^*$: since $k^T y = b > 1$, it suffices
to note that $k^T c = c^T k \le 1$ for all $c \in K^*$, by the definition of $K^*$ and the fact that $k \in K$.
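The two routines just described can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the polytope $K$ and point $p$ from the caption of Figure 4.3, uses a hypothetical vertex-enumeration function in place of the optimisation oracle, and signals the assertions "$y \in K^*$" and "$y \in Q_p$" by returning `None` (a convention chosen here for brevity).

```python
# Toy instance taken from the caption of Figure 4.3.
K_VERTICES = [(0.0, 1.0), (-1.0, 1.0), (-1.0, 0.0), (1.0, -2.0)]
P = (-7.0 / 8.0, -3.0 / 4.0)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def opt_oracle(y):
    """Hypothetical optimisation oracle for K: a vertex k maximising y^T x."""
    return max(K_VERTICES, key=lambda v: dot(y, v))


def sep_polar(y):
    """Separation routine for K* built from the optimisation oracle for K.

    Returns None when y is in K*; otherwise returns k in K with k^T y > 1,
    while k^T c <= 1 holds for every c in K*.
    """
    k = opt_oracle(y)
    b = dot(y, k)
    if b <= 1.0:
        return None
    return k


def sep_Qp(y, p=P):
    """Separation routine for Q_p = K* intersected with {c : p^T c >= 1}."""
    if dot(p, y) < 1.0:
        return (-p[0], -p[1])   # y violates the halfspace constraint
    return sep_polar(y)         # None means y is in Q_p
```

With these data, the vertex $(-1, -1)$ of $K^*$ is reported as a point of $Q_p$, the origin is cut off by $-p$, and $(-2, -2)$ is cut off by a vertex of $K$.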

Figure 4.3 shows the relationship between the method of Section 4.4 and the above method,
by illustrating that the set $K_p$ (defined in (4.2)) is just the radial projection of $Q_p$ onto $B_n$.
Thus, unsurprisingly, both methods test the feasibility of virtually the same thing. The
novelty of the method of Section 4.4 lies in the way the cutting planes are generated.

7 Note that $Q_p$ is guaranteed not to be empty when $p \notin K$. For, then, there certainly exists some
plane $\pi_{c',b'}$ separating $p$ from $K$. But since $K$ contains the origin, $b'$ may be taken to be positive.
Thus $\pi_{c'/b',1}$ separates $p$ from $K$.

8 I slightly abuse the oracular "$\mathcal{O}$" notation, introduced in an earlier section, by using it for both truly
oracular (black-boxed) routines and for other (possibly not completely black-boxed) routines.

4.6 Cutting-plane algorithms for convex feasibility for K'

Some remarks about convex feasibility cutting-plane algorithms for $K' \subset \mathbb{R}^n$, relative to a
separation oracle $\mathcal{O}_{\mathrm{SSEP}(K')}$, are in order. All such algorithms have the same basic structure:

(i) Define a (possibly very large) regular bounded convex set $P_0$ which is guaranteed to
contain $K'$, such that, for some reasonable definition of "centre", the centre $\omega_0$ of $P_0$ is
easily computed. The set $P_0$ is called an outer approximation to $K'$. Common choices
for $P_0$ are the origin-centred hyperbox, $\{x \in \mathbb{R}^n : -2^L \le x_i \le 2^L, 1 \le i \le n\}$, and the
origin-centred hyperball, $\{x : x^T x \le 2^L\}$ (where $2^L$ is a trivially large bound).

(ii) Give the centre $\omega$ of the current outer approximation $P$ to $\mathcal{O}_{\mathrm{SSEP}(K')}$.

(iii) If $\mathcal{O}_{\mathrm{SSEP}(K')}$ asserts "$\omega \in K'$", then HALT.

(iv) Otherwise, say $\mathcal{O}_{\mathrm{SSEP}(K')}$ returns the hyperplane $\pi_{c,b}$ such that $K' \subset \{x : c^T x \le b\}$.
Update (shrink) the outer approximation $P := P \cap \{x : c^T x \le b'\}$ for some $b' \ge b$.
Possibly perform other computations to further update $P$. Check stopping conditions;
if they are met, then HALT. Otherwise, go to step (ii).

The difficulty with such algorithms is knowing when to halt in step (iv). Generally, the stopping
conditions are related to the size of the current outer approximation. Because it is always
an approximate (weak) feasibility problem that is solved, the associated accuracy parameter
$\delta$ can be exploited to get a "lower bound" $V$ on the "size" of $K'$, with the understanding that
if $K'$ is smaller than this bound, then the algorithm can correctly assert that $S(K', -\delta)$ is
empty. Thus the algorithm stops in step (iv) when the current outer approximation is smaller
than $V$.
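Steps (i) to (iv), together with a volume-based stopping rule, can be sketched with the classical central-cut ellipsoid update standing in for the "centre" rule. This is an assumption made purely for illustration (the algorithm developed in this thesis uses analytic centres instead); the function name, the oracle convention and the parameter `min_volume_ratio`, which plays the role of the bound $V$ relative to the initial volume, are all hypothetical.

```python
import math


def ellipsoid_feasibility(sep_oracle, n, radius, min_volume_ratio, max_iter=10000):
    """Central-cut ellipsoid sketch of steps (i)-(iv), for n >= 2.

    sep_oracle(x) returns None if x is in K', else a vector c with
    c^T y <= c^T x for every y in K'.  The outer approximation is the
    ellipsoid {x : (x - w)^T A^{-1} (x - w) <= 1}, starting from a ball.
    Returns a point of K', or None once the ellipsoid volume has shrunk
    below min_volume_ratio times the initial volume.
    """
    w = [0.0] * n                                             # step (i)
    A = [[radius ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # log of the per-step volume ratio (n/(n+1)) * (n^2/(n^2-1))^((n-1)/2)
    step = math.log(n / (n + 1.0)) + 0.5 * (n - 1) * math.log(n * n / (n * n - 1.0))
    log_ratio = 0.0
    for _ in range(max_iter):
        c = sep_oracle(w)                                     # step (ii)
        if c is None:                                         # step (iii)
            return w
        Ac = [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
        cAc = sum(c[i] * Ac[i] for i in range(n))
        g = [v / math.sqrt(cAc) for v in Ac]
        # step (iv): smallest ellipsoid containing the half kept by the cut
        w = [w[i] - g[i] / (n + 1.0) for i in range(n)]
        scale = n * n / (n * n - 1.0)
        A = [[scale * (A[i][j] - 2.0 / (n + 1.0) * g[i] * g[j]) for j in range(n)]
             for i in range(n)]
        log_ratio += step
        if log_ratio < math.log(min_volume_ratio):            # stopping rule
            return None
    return None
```

The stopping rule only makes sense if `min_volume_ratio` is a genuine lower bound on the relative volume of a nonempty $K'$; otherwise the routine may halt before a feasible centre is found.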

The cutting-plane algorithm is called (oracle-) polynomial-time if it runs in time
$O(\mathrm{poly}(n, \log(1/\delta)))$ with unit cost for the oracle. It is called (oracle-) fully polynomial if
it runs in time $O(\mathrm{poly}(n, 1/\delta))$. This thesis is concerned primarily with polynomial-time
cutting-plane algorithms.

Using the standard cut-generation rule, there are a number of polynomial-time convex 
feasibility algorithms that can be applied (see j^l for a discussion of all of them). The 
three most important are the ellipsoid method, the volumetric centre method, and the analytic 
centre method. The ellipsoid method has $P_0 = \{x : x^T x \le 2^L\}$ and is the only one which
requires "further update" of the outer approximation $P$ in step (iv) after a cut has been
made: a new minimal-volume ellipsoid is drawn around $P := P \cap \{x : c^T x \le b'\}$. The
ellipsoid method, unfortunately, suffers badly from gigantic precision requirements, making it
unusable in practice. The volumetric centre and analytic centre algorithms are more efficient
than the ellipsoid algorithm and are very similar to each other in complexity and precision
requirements, with the analytic centre algorithm having some supposed practical advantages.⁹

CHAPTER 4. REDUCTION TO ENTANGLEMENT WITNESS SEARCH

The cutting plane $\{x : c^T x = b'\}$ requires further definition:

$$\text{If } b' \;\begin{cases} < \\ = \\ > \end{cases}\; c^T\omega, \text{ then the above is a } \begin{cases} \text{deep-cut} \\ \text{central-cut} \\ \text{shallow-cut} \end{cases} \text{ algorithm.} \qquad (4.6)$$

Intuitively, deep-cut algorithms should be fastest. Ironically, though, except for the case
of ellipsoidal algorithms (which are practically inefficient), the algorithms that are provably
polynomial-time are central- or even shallow-cut algorithms. For instance, even though
$\mathcal{O}_{\mathrm{SSEP}(K^*)}$, built on $\mathcal{O}_{\mathrm{SOPT}(K)}$, gives deep cuts $\pi_{k,1}$, it is not known how to utilise the deep cuts
to get a polynomial-time algorithm using analytic or volumetric centers. Note that the new
cut-generation method in Section 4.4 is capable only of giving central cuts; but this does not,
a priori, put it at any disadvantage (relative to the standard cut-generation method) with
regard to polynomial-time analytic or volumetric centre algorithms. We will see in Chapter 5
that this new cut-generation rule indeed yields a polynomial-time algorithm.



4.7 A new quantum separability algorithm 

The algorithm in Chapter 5, which is based on analytic centres, gives a new method for
solving the quantum separability problem by solving $\mathrm{WSEP}_{\mathrm{In}}(S_{M,N})$. As we will see, the
number of arithmetic operations required by the algorithm is

$$O\big((T + M^6 N^6 \log(1/\delta))\, M^2 N^2 \log^2(M^2 N^2/\delta)\big), \qquad (4.7)$$

where $T$ is the cost of one call to the $\mathrm{WOPT}(S_{M,N})$ routine.

Now consider the complexity of computing an instance $(A, \epsilon)$ of $\mathrm{WOPT}(S_{M,N})$. The only
way to get an upper bound on this complexity is to assume the most naive way to carry out
this computation, which is to calculate $\mathrm{tr}(A\sigma)$ one-by-one for each of the pure separable states
$\sigma$ to a sufficiently high precision, and then return the $\sigma$ that produced the largest value of
$\mathrm{tr}(A\sigma)$.

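I use the framework and notation introduced earlier. Suppose $\sigma = \alpha\alpha^\dagger \otimes \beta\beta^\dagger$ maximises
$\mathrm{tr}(A\sigma)$, and, as before, let $\tilde\alpha$ and $\tilde\beta$ be the $p'$-bit truncations of $\alpha$ and $\beta$. Let
$\gamma := \alpha\alpha^\dagger \otimes \beta\beta^\dagger - \tilde\alpha\tilde\alpha^\dagger \otimes \tilde\beta\tilde\beta^\dagger$. The real coordinates of the entries of $[\gamma]$ have absolute
value no greater than $2^{-(p'-6)}$. Since we give to the $\mathrm{WOPT}(S_{M,N})$ routine an $A$ such that
$\|A\|_2 = 1$, we have $\|A\|_1 \le \sqrt{MN}\,\|A\|_2 = \sqrt{MN}$ which, since $A$ is normal, bounds the sum of the
absolute values of the eigenvalues of $A$ by $\sqrt{MN}$. This gives a bound of $|A_{ij}| \le \sqrt{MN}$, where
$A_{ij}$ are the entries of $[A]$. It follows that

$$\mathrm{tr}(A(\alpha\alpha^\dagger \otimes \beta\beta^\dagger)) - \mathrm{tr}(A(\tilde\alpha\tilde\alpha^\dagger \otimes \tilde\beta\tilde\beta^\dagger)) = \mathrm{tr}(A\gamma) \qquad (4.8)$$
$$\le M^{2.5} N^{2.5}\, 2^{-(p'-7)}. \qquad (4.9)$$

We set $p'$ such that $M^{2.5} N^{2.5}\, 2^{-(p'-7)} \le \epsilon$, which gives

$$p' \ge \log_2\!\left(\frac{M^{2.5} N^{2.5}}{\epsilon}\right) + 7. \qquad (4.10)$$

9 To date, no one has implemented a polynomial-time cutting plane algorithm. For an implementation
of a fully polynomial algorithm, see http://ecolu-info.unige.ch/logilab.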
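As a quick numerical illustration of (4.9) and (4.10), the following sketch computes the smallest integer $p'$ that satisfies the bound for given $M$, $N$ and $\epsilon$. The function names are illustrative only and are not part of the thesis.

```python
import math


def truncation_bits(M, N, eps):
    """Smallest integer p' with M^2.5 * N^2.5 * 2^-(p'-7) <= eps, as in (4.10)."""
    return math.ceil(math.log2((M * N) ** 2.5 / eps) + 7)


def truncation_error_bound(M, N, p_prime):
    """Right-hand side of (4.9) for a given number of bits p'."""
    return (M * N) ** 2.5 * 2.0 ** (-(p_prime - 7))
```

For two qubits ($M = N = 2$) and $\epsilon = 0.01$ this gives $p' = 19$, since $32 \cdot 2^{-12} \approx 0.0078 \le 0.01$ while $32 \cdot 2^{-11} \approx 0.0156 > 0.01$.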



This gives¹⁰

$$T \sim 2^{2p'(M+N)}\, \mathrm{poly}(M, N, 1/\delta) \qquad (4.11)$$
$$\phantom{T} < \left(\frac{2^7 M^{2.5} N^{2.5}}{\epsilon}\right)^{2(M+N)} \mathrm{poly}(M, N, 1/\delta). \qquad (4.12)$$


In practice, however, it need not be so bad. We can formulate the optimisation problem
as the (constrained or unconstrained) maximisation of a real function $f(\sigma) := \mathrm{tr}(A\sigma)$ of real
variables parametrising $\sigma$, and then apply continuous optimization methods to $f$. Denote
by $f^*$ the global maximum of $f$. As the global optimisation algorithm proceeds, it may give
progressively better lower and upper bounds on $f^*$.¹¹ Call these bounds $\underline{f}$ and $\overline{f}$, respectively.
A key advantage of the algorithm is that, during any computation of $\mathcal{O}(A)$, the search for $f^*$
may be halted early when either (i) $\mathrm{tr}(A\rho) \le \underline{f}$, in which case the cutting-plane lemma of
Section 4.4 can be invoked to generate a new cutting plane, or (ii) $\overline{f} < \mathrm{tr}(A\rho)$, in which case the
algorithm has found an entanglement witness for $\rho$. Note that lower bounds $\underline{f}$ can be generated
very quickly using local optimisation routines seeded at random points in the domain of $f$. Thus,
the algorithm's run time may be significantly shorter than the worst-case complexity of
$\mathrm{WOPT}(S_{M,N})$ predicts.
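The early-termination logic can be sketched as below. The sketch makes several simplifying assumptions that are not in the text: it treats two qubits only, restricts the search to real product states, and obtains the lower bound by plain random sampling rather than by local optimisation routines; upper bounds (for example from an interval-analysis method) are assumed to be supplied from elsewhere. All function names are hypothetical.

```python
import math
import random


def product_state_value(A, theta, phi):
    """f = <v|A|v> for the real two-qubit product state v = alpha (x) beta."""
    alpha = (math.cos(theta), math.sin(theta))
    beta = (math.cos(phi), math.sin(phi))
    v = [a * b for a in alpha for b in beta]
    return sum(v[i] * A[i][j] * v[j] for i in range(4) for j in range(4))


def sampled_lower_bound(A, samples=200, seed=0):
    """Crude lower bound on f*: best value over random real product states."""
    rng = random.Random(seed)
    return max(
        product_state_value(A, rng.uniform(0.0, math.pi), rng.uniform(0.0, math.pi))
        for _ in range(samples)
    )


def early_decision(tr_A_rho, f_lower, f_upper):
    """Decide whether the global search may stop, given f_lower <= f* <= f_upper."""
    if tr_A_rho <= f_lower:
        return "cut"        # A cannot separate rho: generate a cutting plane
    if f_upper < tr_A_rho:
        return "witness"    # A is an entanglement witness for rho
    return "continue"       # bounds are not yet tight enough
```

As an example, for $A$ equal to the projector onto the Bell state $(|00\rangle + |11\rangle)/\sqrt{2}$ the maximum over product states is $1/2$, so the maximally mixed state (with $\mathrm{tr}(A\rho) = 1/4$) triggers a cut as soon as the sampled lower bound exceeds $1/4$.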



4.8 Complexity comparison of algorithms 

All of the algorithms considered solve the weak membership problem for $S_{M,N}$ with accuracy
parameter $\delta$. How does the new separability algorithm of the previous section compare
to the others?

Recall the reasonable presumption that the numerical integration in Zapatrin's algorithm

10 When looping through all the elements $\tilde\alpha$ and $\tilde\beta$ in practice, we would skip all $\tilde\alpha$ and $\tilde\beta$ whose
norms are greater than 1, so as not to report an inflated global maximum.

11 Upper bounds on /* are given by Hansen's interval-analysis global optimisation algorithm [jR 
Iffi^ . This algorithm calculates bounds on the derivative of / (over a bounded domain) in order to 
compute upper bounds on /*. 



(Section 3.3.4) is far more computationally intensive than the global minimisation of the new
algorithm. Recall also that Perez-Garcia's algorithm (Section 3.3.3) is not clearly related to
the weak membership problem, barring new results about the $\|\cdot\|_\gamma$-norm and robustness of
entanglement.

The following table summarises the dominating factors (that are at least factorial in $M$
or $N$)¹² in the run-times of the new algorithm and the algorithms of Sections 3.3.1 and 3.3.2.

    Algorithm                                                    Dominating factor
    Search for separable decomposition (Section 3.3.1)           $(MN/\delta)^{O(M^3N^2 + M^2N^3)}$
    Bounded search for symmetric extensions (Section 3.3.2)      $(M/\delta)^{O(M)}$
    Search for entanglement witness (Section 4.7)                $(MN/\delta)^{O(M+N)}$

Of the three algorithms in the table, the search for separable decompositions is, as expected,
the most complex.

A few remarks are in order regarding the new algorithm and the bounded search for
symmetric extensions. Right away, we can see that if $M$ is a constant, then the bounded
search for symmetric extensions has a much lower complexity. Note, however, that if $M = N$,
then the two complexities, as summarised in the table, become the same. As a related side
point, note that Gurvits has actually shown $\{\mathrm{WMEM}(S_{M,N})\}_{M,N}$ to be NP-hard when
$M \le N \le M(M-1)/2$; it is an open problem as to whether, say, $\{\mathrm{WMEM}(S_{2,N})\}_{N}$ is NP-hard.
So, if we want to be absolutely sure we are solving a hard problem, we can restrict to the
case where $M = N$. In this case, it is easy to check that the detailed complexity estimates
given previously indicate that the new algorithm has a better complexity, even when we
take into account that the bounded search for symmetric extensions uses the trace norm as
opposed to the Euclidean norm. Recall that the bounded search for symmetric extensions has
complexity on the order of $d_{S_k}^4$, where we can invoke the lower bound on $d_{S_k}$ from the earlier
equation to get $d_{S_k}^4 \ge 2^{16M-4} M^{44M}/\delta^{8M-8}$. But the algorithm gets a complexity
reduction for solving the weak membership problem with respect to the trace distance instead
of the Euclidean distance. This reduction corresponds to substituting $M\delta$ for $\delta$ in the above
lower bound, which gives the best known lower bound on the complexity of the bounded
search for symmetric extensions of

$$2^{16M-4}\, M^{36M+8} \left(\frac{1}{\delta}\right)^{8M-8}. \qquad (4.13)$$

The dominant factor in the run-time estimate of the new algorithm, which appears in (4.12),
is $(2^7 M^{2.5} N^{2.5}/\epsilon)^{2(M+N)}$. In Chapter 5 we will see that $\epsilon := \delta/5$. Making this substitution
and setting $N := M$ gives an upper bound (ignoring polynomial factors) on the run time of
the new algorithm of

$$2^{40M}\, M^{20M} \left(\frac{1}{\delta}\right)^{4M}. \qquad (4.14)$$

The factors in (4.13) and (4.14) that are at least factorial in $M$ are, respectively, $M^{36M+8}$ and
$M^{20M}$, the former being larger. As well, the dependence on $\delta$ in (4.13) is worse than that in
(4.14). Therefore, the new algorithm has the smaller run-time estimate when $M = N$.

12 Recall Stirling's approximation: $n! \sim \sqrt{2\pi n}\,(n/e)^n$.
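To get a feel for the two estimates, the sketch below evaluates the base-2 logarithms of (4.13) and (4.14), taking both expressions at face value and ignoring all polynomial factors. The function names are illustrative, and the sample values of $M$ and $\delta$ are arbitrary choices (with $M \ge 3$) made only for the comparison.

```python
import math


def log2_symmetric_extension_bound(M, delta):
    """log2 of (4.13): 2^(16M-4) * M^(36M+8) * (1/delta)^(8M-8)."""
    return (16 * M - 4) + (36 * M + 8) * math.log2(M) + (8 * M - 8) * math.log2(1.0 / delta)


def log2_witness_search_bound(M, delta):
    """log2 of (4.14): 2^(40M) * M^(20M) * (1/delta)^(4M)."""
    return 40 * M + 20 * M * math.log2(M) + 4 * M * math.log2(1.0 / delta)


if __name__ == "__main__":
    for M in (3, 5, 10):
        for delta in (1e-1, 1e-3):
            print(M, delta,
                  round(log2_symmetric_extension_bound(M, delta), 1),
                  round(log2_witness_search_bound(M, delta), 1))
```

For the sampled values the exponent of (4.14) is the smaller of the two, in line with the comparison of the $M$-dependent and $\delta$-dependent factors made above.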
Figure 4.1: The sets $M_p$ and $K_p$ in $\mathbb{R}^2$. Pictured in heavy outline is a set $K$ in $\mathbb{R}^2$, where
$K := \mathrm{conv}\{(0,1), (-1,1), (-1,0), (1,-2)\}$. A point $p = (-7/8, -3/4)$ is shown as a heavy dot. The
unit circle is drawn in a dashed line. The set $M_p$ is the arc of the unit circle that the shaded pie-slice
subtends; the set $K_p$ is the shaded pie-slice. In two dimensions, the set $M_p$ ($K_p$) is easy to construct.
This construction has been illustrated: draw the two distinct lines through $p$ that are tangent to $K$;
the lines that determine the pie-slice are the two straight lines that are perpendicular to the lines
through $p$. The idea behind this geometrical construction easily generalises to $\mathbb{R}^3$.

Figure 4.2: Illustration of the intuitive heuristic behind the cutting-plane lemma of Section 4.4.
Continuing from Figure 4.1, the unit vector $c$ is a test vector that is close to $M_p$ but not in $M_p$.
Evidently, adding a component of $(p - k_c)$ to $c$ moves it closer to $M_p$.

Figure 4.3: The upper picture is a set $K$ in $\mathbb{R}^2$, where $K := \mathrm{conv}\{(0,1), (-1,1), (-1,0), (1,-2)\}$.
A point $p = (-7/8, -3/4)$ is shown. The polar $K^*$ of $K$ is shown in heavy outline in the lower
picture; $K^* = \mathrm{conv}\{(0,1), (-1,0), (-1,-1), (3,1)\}$. The set $Q_p$ is the shaded polytope, bounded by
the long-dashed plane $\{c : p^T c = 1\}$. The set $K_p$ is the shaded pie-slice and is the radial projection
of $Q_p$ onto the origin-centred unit ball (whose boundary is shown as a short-dashed circle). The
particular $K$ and $K^*$ are taken from



Chapter 5 



New polynomial-time reduction from 
WSEP to WOPT 



As promised, I now show that the cut-generation rule of Section 4.4, which is based
on an intuitive heuristic, yields an oracle-polynomial-time algorithm for the in-biased weak
separation problem for a convex set $K \subset \mathbb{R}^n$ relative to an oracle for the weak optimization
problem for $K$; we only assume that $K$ contains a ball of finite radius centered at a known point
$c_0$ and is contained in a ball of finite radius $R$. The algorithm uses $O(\mathrm{poly}(n, \log(R/\delta)))$ calls
to the weak optimisation oracle, where $\delta$ is the accuracy parameter that appears in the definition
of that problem. For the remainder of this thesis, $\mathcal{O}$ will denote the oracle for the weak optimisation
problem for $K$. One simplifying assumption that we will carry through this chapter, without
loss of generality, is that $c_0$ is the origin. This new algorithm is based on the analytic centre
cutting-plane algorithm of Atkinson and Vaidya [91].

Continuing the discussion in the previous chapter, Section 5.1 gives the main idea behind
the new algorithm. Section 5.2 presents the algorithm in terms of parameters that will be
given in Section 5.3, which contains the proof of correctness of the algorithm. Section 5.4
discusses complexity and relates the algorithm to the standard cut-generation method of
Section 4.5. Section 5.5 gives the algorithm's parameters for the specific case of the quantum
separability problem.

5.1 The Main Idea of the Algorithm 

The general idea of the algorithm is as follows. Let $P$ be the current outer approximation
$P := B_n \cap \bigcap_{i=1}^{h} \{x : a_i^T x \ge b_i\}$, as described in the second-last paragraph of Section 4.4.
Recall that we need a definition of "centre $\omega$ of $P$" that satisfies (4.3). Define the analytic
centre $\omega$ of $P$ as the unique minimiser of the real convex function

$$F(x) := -\sum_{i=1}^{h} \log(a_i^T x - b_i) - \log(1 - x^T x). \qquad (5.1)$$

The relation $\nabla F(\omega) = 0$ gives

$$\omega = \frac{1 - \omega^T\omega}{2} \sum_{i=1}^{h} \frac{a_i}{a_i^T\omega - b_i}, \qquad (5.2)$$

which shows that $\omega$, defined as the analytic centre of $P$, indeed satisfies (4.3).
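The definition (5.1) and the identity (5.2) can be checked numerically in two dimensions. The sketch below is an illustration only: it minimises $F$ with a textbook damped Newton iteration (full steps once the Newton decrement is below $1/4$, otherwise steps of length $1/(1+\lambda)$), which is a standard scheme for self-concordant functions and is not claimed to be the exact procedure or parameter choice used in this thesis. The gradient and Hessian used are those of $F$ as defined in (5.1).

```python
import math


def grad_hess(x, cuts):
    """Gradient and Hessian of F(x) = -sum_i log(a_i^T x - b_i) - log(1 - x^T x) in R^2."""
    gx = gy = hxx = hxy = hyy = 0.0
    for (a, b) in cuts:
        s = a[0] * x[0] + a[1] * x[1] - b        # slack a_i^T x - b_i > 0
        gx -= a[0] / s
        gy -= a[1] / s
        hxx += a[0] * a[0] / s ** 2
        hxy += a[0] * a[1] / s ** 2
        hyy += a[1] * a[1] / s ** 2
    u = 1.0 - (x[0] ** 2 + x[1] ** 2)            # 1 - x^T x > 0
    gx += 2.0 * x[0] / u
    gy += 2.0 * x[1] / u
    hxx += 4.0 * x[0] * x[0] / u ** 2 + 2.0 / u
    hxy += 4.0 * x[0] * x[1] / u ** 2
    hyy += 4.0 * x[1] * x[1] / u ** 2 + 2.0 / u
    return (gx, gy), (hxx, hxy, hyy)


def analytic_centre(cuts, x0, tol=1e-18, max_iter=200):
    """Approximate minimiser of F, started from an interior point x0."""
    x = x0
    for _ in range(max_iter):
        (gx, gy), (hxx, hxy, hyy) = grad_hess(x, cuts)
        det = hxx * hyy - hxy * hxy
        dx = -(hyy * gx - hxy * gy) / det        # Newton direction -H^{-1} g
        dy = -(hxx * gy - hxy * gx) / det
        lam2 = -(gx * dx + gy * dy)              # squared Newton decrement
        if lam2 < tol:
            break
        lam = math.sqrt(lam2)
        t = 1.0 if lam < 0.25 else 1.0 / (1.0 + lam)
        x = (x[0] + t * dx, x[1] + t * dy)
    return x
```

For the single cut $a_1 = (1, 0)$, $b_1 = 0$, the iteration converges to $a_1/\sqrt{3}$, and for any set of cuts the computed centre can be substituted into both sides of (5.2) as a consistency check.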

The algorithm stops when the current outer approximation becomes either too small
(volume-wise) or too thin to contain $K_p$. For this, a lower bound $r > 0$ on the radius of
the largest ball contained in $K_p$ is needed. By exploiting the accuracy parameter $\delta$ of the
weak separability problem, such an $r$ exists and is derived in Section 5.3.5.

The actual algorithm is not as straightforward. For instance, each time a new cutting
plane is added, it is shifted by some amount ($b_i < 0$) so as to keep the analytic centre of the
old $P$ in the new $P$. As well, cutting planes are occasionally discarded so that $h$ does not
exceed some prespecified number. This shifting and discarding of hyperplanes is done exactly
as in [91]. To facilitate comparison, we use notation that corresponds to the notation used in
[91].
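The shift of a new cutting plane can be made concrete. In the algorithm listing of Section 5.2 the offset $\beta$ of a new cut with unit normal $a$ is fixed by requiring $a^T[\nabla^2 F(z)]^{-1}a/(a^T z - \beta)^2 = \gamma_0^2$; taking the root with $a^T z - \beta > 0$, so that the current centre $z$ stays inside the new region, gives $\beta = a^T z - \sqrt{a^T[\nabla^2 F(z)]^{-1}a}/\gamma_0$. The sketch below evaluates this in two dimensions. The function name, the tuple layout of the Hessian, and the value $\gamma_0 = 0.9$ used in the example are assumptions for illustration, not the constants of the thesis.

```python
import math


def shifted_cut_offset(H, a, z, gamma0):
    """Offset beta with (a^T H^{-1} a) / (a^T z - beta)^2 = gamma0^2 and a^T z - beta > 0.

    H = ((hxx, hxy), (hxy, hyy)) is the symmetric positive definite Hessian at z.
    """
    (hxx, hxy), (_, hyy) = H
    det = hxx * hyy - hxy * hxy
    # a^T H^{-1} a via the explicit inverse of a 2x2 matrix
    quad = (hyy * a[0] ** 2 - 2.0 * hxy * a[0] * a[1] + hxx * a[1] ** 2) / det
    return a[0] * z[0] + a[1] * z[1] - math.sqrt(quad) / gamma0
```

For example, at the initial centre $z = a_1/\sqrt{3}$ with $a_1 = (1, 0)$ the Hessian of $F$ works out by hand to $\mathrm{diag}(9, 3)$, and a new cut with normal $a = (0, 1)$ receives a negative offset, consistent with the remark that the shifted planes have $b_i < 0$.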



5.2 The Algorithm 

Following [91], the algorithm utilises three types of quantities ($\sigma_i(z)$, $\kappa(a_i, b_i)$, and $\mu_i(z)$),
whose significance we now briefly explain. Suppose that $P = B_n \cap \bigcap_{i=1}^{h} \{x : a_i^T x \ge b_i\}$ is the
current search space at some stage during the algorithm; that is, suppose a total of $h$ cutting
planes have been generated. Denote the hyperplane $\{x : a_i^T x - b_i = 0\}$ by the ordered pair
$(a_i, b_i)$. Recall that for any positive definite matrix $A$, one can define the ellipsoid $E(A, z, r)$
as

$$E(A, z, r) := \{x \in \mathbb{R}^n : (x - z)^T A (x - z) \le r^2\}. \qquad (5.3)$$

When $A = \nabla^2 F(z)$, we refer to $E(A, z, r)$ as the Hessian ellipsoid.

We mentioned that one of the stopping conditions is that the volume of $P$ gets too small
to contain $K_p$. Later we will see that the volume of $P$ can be related to the determinant of
$\nabla^2 F(\omega)$, where $\omega$ is the analytic center of $P$. Define the quantities

$$\sigma_i(x) := \frac{a_i^T (\nabla^2 F(x))^{-1} a_i}{(a_i^T x - b_i)^2} \qquad (5.4)$$

for $x \in P$. The denominator is the square of the distance from $x$ to the hyperplane $(a_i, b_i)$.
The numerator is the square of the radius of the Hessian ellipsoid $E(\nabla^2 F(x), x, 1)$ in the
direction of $a_i$. In Lemma 17 we will see that $E(\nabla^2 F(x), x, 1) \subset P$. The smaller the quantity
$\sigma_i(x)$, the further away the hyperplane $(a_i, b_i)$ is from the ellipsoid $E(\nabla^2 F(x), x, 1)$. If $z$ is
an approximate analytic center of $P$, then a sufficiently small value of $\sigma_i(z)$ will indicate
that $(a_i, b_i)$ has a small effect on $\det(\nabla^2 F(z))$ and so it can be discarded because it does not
sufficiently affect the volume of $P$.

Computing $\sigma_i(z)$ values is relatively computationally expensive, so there is a simple test
that can trigger a check of $\sigma_i(z)$. When the hyperplane $(a_i, b_i)$ is first introduced, the quantity
$\kappa(a_i, b_i)$ is set to $a_i^T z - b_i$, which is the distance from $(a_i, b_i)$ to the approximate analytic center
$z$ of $P$. If, at some later step, we find that the distance from the current approximate analytic
center $z$ to $(a_i, b_i)$ has doubled, then the quantity $\sigma_i(z)$ is computed and tested. We denote
the ratio of the current distance to the original distance by $\mu_i(z) := (a_i^T z - b_i)/\kappa(a_i, b_i)$. If
$\sigma_i(z)$ is not sufficiently small, then $\kappa(a_i, b_i)$ is reset to the current distance.
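A small numerical sketch of the two diagnostics may help fix ideas. It assumes a two-dimensional setting in which the Hessian $\nabla^2 F(x)$ is supplied by the caller as a symmetric $2 \times 2$ tuple; the function names are illustrative and the thresholds against which $\sigma_i$ and $\mu_i$ are compared are deliberately left out, since they are constants fixed elsewhere in the thesis.

```python
import math


def hessian_radius_sq(H, a):
    """a^T H^{-1} a for a symmetric positive definite H = ((hxx, hxy), (hxy, hyy))."""
    (hxx, hxy), (_, hyy) = H
    det = hxx * hyy - hxy * hxy
    return (hyy * a[0] ** 2 - 2.0 * hxy * a[0] * a[1] + hxx * a[1] ** 2) / det


def sigma(H, a, b, x):
    """Squared Hessian-ellipsoid radius along a, divided by the squared distance to (a, b)."""
    slack = a[0] * x[0] + a[1] * x[1] - b
    return hessian_radius_sq(H, a) / slack ** 2


def mu(a, b, z, kappa):
    """Ratio of the current distance from z to (a, b) to the recorded distance kappa."""
    return (a[0] * z[0] + a[1] * z[1] - b) / kappa
```

For the initial search space (one cut $a_1 = (1, 0)$, $b_1 = 0$, centre $z = a_1/\sqrt{3}$) the Hessian of $F$ works out by hand to $\mathrm{diag}(9, 3)$, which gives $\sigma_1(z) = 1/3$ and, with $\kappa(a_1, b_1) = 1/\sqrt{3}$, $\mu_1(z) = 1$. A value of $\sigma_i$ at most 1 is what one expects from the containment of the unit Hessian ellipsoid in $P$.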

To compute approximate analytic centers, we use the Newton method. A useful function
that measures the quality of the approximation is

$$\lambda(x) := \sqrt{\nabla F(x)^T (\nabla^2 F(x))^{-1} \nabla F(x)}. \qquad (5.5)$$

As well, define the function q\ := 1 — (1 — 3 A) 1 ' 3 for A G R, and the function ^(x) := (A(x)) 2 . 

The subscripts 'd' and 'a' in the algorithm mean 'after a hyperplane is discarded' or 'after
a hyperplane is added', respectively.

The algorithm is presented in terms of undefined constants (all variables with the subscript 
"0" , plus v) and parameters (r,u,8). For a list of the definitions of the parameters and suitable 
values of the constants, the reader may consult subsection 5.3.5.

The stopping conditions in the following algorithm are required for the proof of polynomial-time
convergence, but they are not the best conditions to use in practice. In subsection 5.3.6
we give tighter stopping conditions that depend more heavily on $z$ and $\nabla^2 F(z)$.

The algorithm for the in-biased weak separation problem for $K$, relative to an oracle for
the weak optimization problem for $K$, is as follows:

BEGIN

initialise{

$a_1 := p/\|p\|$

$P := B_n \cap \{x : a_1^T x \ge 0\}$
$z := a_1/\sqrt{3}$
$\kappa(a_1, b_1) := 1/\sqrt{3}$ }
DO{

IF $\max_i \mu_i(z) > 2$ then
Case 1:

IF there is an index $j$ such that $\mu_j(z) > 2$ and $\sigma_j(z) < \sigma_0$ then
Subcase 1.1:

Discard $(a_j, b_j)$ from the set of hyperplanes defining $P$, yielding a new
region $P_d$; $P_{\mathrm{new}} := P_d$.

Starting at Xq := 2, iterate Newton steps until both 

X(xi) < po an d q\( Xi ) < Y+J^T^ ^° a new a PP rox i ma ti on : = %i 
to the new analytic center u; d of P d ; -^new '■= %d- 

ELSE

Subcase 1.2:

Let $(a_j, b_j)$ be any hyperplane such that $\mu_j(z) > 2$.
Reset $\kappa(a_j, b_j) := a_j^T z - b_j$.

ENDIF
ELSE

Case 2:

Call weak optimization oracle on $c := z/\|z\|$ with $\epsilon := \delta/5$.
IF oracle outputs $k_c \in K$ such that $c^T p > c^T k_c + \delta/5$ then

RETURN $c$.
ENDIF

$a := (p - k_c) - c^T(p - k_c)c$; $a := a/\|a\|$.

Compute $\beta$ such that $\gamma^2 := (a^T[\nabla^2 F(z)]^{-1}a)/(a^T z - \beta)^2 = \gamma_0^2$.
Add $(a, \beta)$ to the set of hyperplanes defining $P$, that is,
set $P_a := P \cap \{x : a^T x \ge \beta\}$; $P_{\mathrm{new}} := P_a$.
Starting at x$ := 2;, iterate Newton steps until both 
X(xi) < po and q\( Xi ) < Y^^f^ to a new approximation z a := Xi 
to the new analytic center u; a of P a ; z ncw := z & . 
Set $\kappa(a, \beta) := a^T z_a - \beta$.
ENDIF

$P := P_{\mathrm{new}}$; $z := z_{\mathrm{new}}$. }

until{

Stopping Condition 1: h> unu(n,5), OR 
Stopping Condition 2: 2r > l^M^hR (3^ + 4)} 
ENDDO

RETURN "$p \in S(K, \delta)$"
END



5.3 Proof of Correctness of the Algorithm 

To prove that the algorithm is correct, we need to deal with the fact that the algorithm
is run on a computer with fixed precision. If the volume and width of $K_p$ are to be lower-bounded,
then clearly we need to exploit the weakness of the separability problem; that is,
we only need to find a separating hyperplane for $p$ when $p$ is outside of $S(K, \delta)$. This would
give a lower bound on the volume and width of $K_p$ in terms of $n$, $R$, and $\delta$. We present the
convergence proofs next, assuming that we have a lower bound $r$ on the maximum radius of
a ball contained in $K_p$:

$$r \le \sup_{x} \{r' \in \mathbb{R}_+ : B(x, r') \subset K_p\}, \qquad (5.6)$$

where $B(x, r) := \{y \in \mathbb{R}^n : \|y - x\| \le r\}$ and $\mathbb{R}_+$ denotes the positive real numbers. In
subsection 5.3.2 we will derive a suitable $r = r(n, R, \delta)$. The volume of a hypersphere of
radius r in R n is lower-bounded by (r/n) n Qj. Thus, inequality ()5.6|) gives 

$$\mathrm{volume}(K_p) \ge \left(\frac{r}{n}\right)^n. \qquad (5.7)$$

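We note here expressions for the gradient $\nabla F(x)$ and Hessian $\nabla^2 F(x)$ of the function
$F(x)$ as defined in (5.1):

$$\nabla F(x) = -\sum_{i=1}^{h} \frac{a_i}{a_i^T x - b_i} + \frac{2x}{1 - x^T x},$$

$$\nabla^2 F(x) = \sum_{i=1}^{h} \frac{a_i a_i^T}{(a_i^T x - b_i)^2} + \frac{4xx^T}{(1 - x^T x)^2} + \frac{2I}{1 - x^T x},$$

where $I$ denotes the identity operator.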

The full proof will be given in stages. In subsection 5.3.1 we will present the results
required to prove that the algorithm works with the assumptions that the cutting planes
generated do not cut into the set $K_p$ and that sufficiently good approximations of the analytic
centers are at hand. The proofs (mostly appearing in the Appendix) will be left in terms
of parameters including various constants and the inner radius $r$. In subsection 5.3.2 we
show that such correct cutting planes can be generated. In subsection 5.3.3 we derive a
suitable value for $r$. In subsection 5.3.4 we describe the Newton method used to calculate
approximate analytic centers and show that the number of required Newton iterations is small.
In subsection 5.3.5 we give concrete values for all constants.

Before diving into the tough stuff, I show that the initialisation of the analytic centre
$z := a_1/\sqrt{3}$ is correct. I actually prove something slightly more general, which will come up
in a later discussion.

Fact 15. For $\|a_1\| = 1$, the analytic centre $\omega$ of $\{x : x^T x \le R^*\} \cap \{x : a_1^T x - s \ge 0\}$, for
$s \ge 0$, is

$$\omega = \frac{s + \sqrt{s^2 + 3R^*}}{3}\, a_1. \qquad (5.8)$$

Proof. The equation $\nabla F(\omega) = 0$ (for the barrier of radius $R^*$) gives

$$\frac{2\omega}{R^* - \omega^T\omega} = \frac{a_1}{a_1^T\omega - s}. \qquad (5.9)$$

This implies that $\omega = \lambda a_1$ for some $\lambda > 0$. Making this substitution and solving for $\lambda$ gives
$3\lambda^2 - 2s\lambda - R^* = 0$, which gives the required result. □
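Fact 15 is easy to check numerically. The following sketch, with illustrative function names, evaluates the coefficient in (5.8) and the derivative of the barrier $-\log(\lambda - s) - \log(R^* - \lambda^2)$ restricted to the line $\omega = \lambda a_1$; the derivative should vanish at the stated value of $\lambda$.

```python
import math


def centre_scale(s, R_star):
    """Coefficient lambda in omega = lambda * a_1 from (5.8)."""
    return (s + math.sqrt(s * s + 3.0 * R_star)) / 3.0


def barrier_derivative(lam, s, R_star):
    """Derivative in lam of -log(lam - s) - log(R_star - lam^2)."""
    return -1.0 / (lam - s) + 2.0 * lam / (R_star - lam * lam)
```

With $s = 0$ and $R^* = 1$ this reproduces the initialisation $z = a_1/\sqrt{3}$; with $s = 1/2$ and $R^* = 4$ it gives $\lambda = 4/3$, which lies strictly between $s$ and $\sqrt{R^*}$ as it must.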

5.3.1 Convergence 

The new algorithm for the feasibility problem for $K_p$ differs from the one in [91] in two
essential ways:

(i) I do not assume that we have an unrestricted, unweakened separation oracle for $K_p$.
Rather, we assume that we have a weakened separation oracle (built from the weak
optimization oracle for $K$ and the cutting-plane lemma of Section 4.4) which is restricted
in that it can only handle queries $c$ satisfying $m^T c \ge 0$ for all $m \in K_p$.

(ii) To accommodate the above restriction, I use the 0-centered unit hyperball $B_n$ containing
$K_p$ as the initial search space instead of a 0-centered hyperbox $\{x \in \mathbb{R}^n : -2^L \le x_i \le
2^L, 1 \le i \le n\}$.

The second item above means that the current search space $P$ is never a polytope. Consequently,
most of the lemmas of [91] that are properties of the function $F(x)$ cannot be
used without modification. Luckily, though, the function $F(x)$ is a self-concordant functional
[94] which has all the analogous properties necessary to make the proofs of [91] work for
our algorithm. I present these fundamental lemmas below; the corresponding label number
in [91] will appear in parentheses after our label number. In the following, assume
$P = B_n \cap \bigcap_{i=1}^{h} \{x : a_i^T x \ge b_i\}$ and $F(x) := -\sum_{i=1}^{h} \log(a_i^T x - b_i) - \log(1 - x^T x)$ for $h > 0$, so
that the interior of $P$ is the domain of $F$. As always, $\omega$ denotes the analytic center (unique
minimiser) of $P$ ($F(x)$).

Lemma 16 (Line (2) in [91]). Let $A$ be positive definite. For any fixed vector $w$ in $\mathbb{R}^n$,

$$\max_{x \in E(A, z, r)} w^T(x - z) = r\sqrt{w^T A^{-1} w}.$$

x£E(A,z,r) 



Proof. See f° r example. □ 
Lemma 17 (Lemma 1 in [91]). For every $z \in P$, $E(\nabla^2 F(z), z, 1) \subset P$.
Proof. Follows from definition of self-concordance; see or 0]. □ 
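Lemma 18 (Lemma 3 in [91]). If $\alpha < 1$ and $y \in E(\nabla^2 F(z), z, \alpha)$, then

$$(1 - \alpha)^2\, \xi^T \nabla^2 F(z)\, \xi \le \xi^T \nabla^2 F(y)\, \xi \le (1 - \alpha)^{-2}\, \xi^T \nabla^2 F(z)\, \xi \qquad (5.10)$$

for all $\xi \in \mathbb{R}^n$.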

Proof. Follows from definition of self-concordance; see or 0]. □ 



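Lemma 19 (Corollary 4 in [91]). Suppose $A$ and $B$ are positive definite $n \times n$ matrices
such that $\xi^T A \xi \ge \theta\, \xi^T B \xi$ for some $\theta > 0$ and for all $\xi \in \mathbb{R}^n$. Then $\xi^T A^{-1} \xi \le \theta^{-1} \xi^T B^{-1} \xi$ for
all $\xi \in \mathbb{R}^n$.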

Proof. See proof of Lemma 2 in 

Recall the second-degree Taylor expansion of $F(y)$ about $z \in \mathbb{R}^n$:

$$F(y) - F(z) = \nabla F(z)^T (y - z) + \tfrac{1}{2}(y - z)^T \nabla^2 F(z)(y - z) + \mathrm{Error}. \qquad (5.11)$$

Lemma 20 (Lemma 5 in [91]). If $y \in E(\nabla^2 F(z), z, \alpha)$ where $\alpha < 1$, then the error in
using the second-degree Taylor polynomial constructed about $z$ to approximate $F(y)$ satisfies
$|\mathrm{Error}| \le \dfrac{\alpha^3}{3(1 - \alpha)}$.

Proof. See proof of Theorem 2.2.2 in 0|. □ 

Lemma 21 (Lemma 6 in 0). If X(z) < |, i/ien F(z) - F(u) < ±q 2 x{z) ^^. 

Proof. See proof of Theorem 2.2.2 (Hi) (line 2.2.15) in [9^. □ 

Lemma 22 (Lemma 7 in Let a := ^/{ uj — z ) T V 2 F{z)(u - z). If X(z) < |, then 

ol < q\( z) ■ 

Proof. See proof of Theorem 2.2.2 (Hi) (line 2.2.17) in [2^. □ 

The next lemma gives a Hessian ellipsoid centered at the analytic center $\omega$ which contains
the current search space $P$. The volume of the ellipsoid gives an upper bound on the volume
of $P$ which is useful for knowing when $P$ is too small to contain $K_p$.
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Lemma 23 (Lemma 9 in |9l|). If h > 31 then P C E(V 2 F(u), u, y/Uh). 



Proof. Since u is the unique minimiser of F(x), we have 



T = VF u f = Y rp 1 , + =- ^ =- = V — 



Therefore, 



a; — 6, 1 — u; T co> 1 — c<j t c<j ^— ' a; — 
i=l 4 1 i=l 1 1 



\- ajuj - b si I ^ a-' \ ^ 6, : 



. a- 'a; - 6» \ ^ aj'cu - h I ^ afu - h 

i=i 1 \«=i 1 / i=i 1 



^ E^ 



1 — uj t uj ' — 6, 

2cj t u; ^ — af a; + af a; — 6j afx 2u T x 

1 — c<j t ci; ^— ' afcj — 6j ^— ' afa; — 6; 1 — a> T u; 

i=l 1 i=l 1 



T x=0 

2 

af (x — cj) + af — 6, 



af (x — c<j) + afa; — hi 1uj t {x — to) 

aju — hi 1 — uFoj 

i=i 1 



h rp. \ | T 



E 

vi=l 

/ 2uj t {x — u>)\ 2 ( clJ(x — u) + ajuo — bj\ u T (x — u>) 

\ 1-lu t lu J \~( afu-bi J 1-lv t uj 



Now, for x G P, we have that afx — b{ > and so 

E 



2 

af (x — u;) + aju — bi\ ^ ^ aj (x — uj) + afcu — &j 
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Therefore, 

(af(x — u) + (afuj - bi)f 



" 2 >£ 



(afw - h)' 

' 2uj T {x — to) \ 2 ^ ( af(x — to) + af a; — b{ \ uj t {x — uj) 



1 - uj t uj ) ajuj-b, ) \-uj t uj 



^ (af(x-uj)) 2 y^ af(x-u) 

2^ („T.,_ h .\2 +Z Z^ „T,.,_h. + 



/ 2uj t (x — uj) \ 2 ( a[(x — uj) + a; — fej \ u; T (x — c<j) 
+ i_,,,t,,. ~ 4 ^t7 



1 — uj t uj J \ ^— ' a; a; — &; 11 — uj t uj 



^(af(x-cu)) 2 , /2cu T (x-cu)V , /2||x-cu|| 2 2||x-a;| 



f (af a; — bi) 2 V 1 — c<j t u; / \ 1- u; T u; 1 — cj T a; 



=o 



2 gfOg ~ ^) , ^ _ 4 [ af(x-uj) + afuJ-bA uj t (x - uj) 
^ afuj - bi I A' a?u; - 6,- / 1 - cj t cj 



8=1 

(x - u,) T V 2 F(u;)(x - w) + h - ^ - ' ^ 

1 — UJ 1 UJ 

+2 af(x ~ u) _ 4: (y^ af(x - uj) + afuj -b t \ uj t (x - uj) 

^ ajuj -bi \ aju -bi J 1 - uj t uj 

i=i 1 \ i=i 1 / 

/ n t^9^/ n/ x 2(x — uj) t (x — uj) 
{x - uj) t V 2 F{uj){x -uj) + h- V - ; K - 

1 — UJ 1 UJ 

4uj t (x — uj) ( 2uj t {x — uj) \ uj t (x — uj) 



1 — uj t uj V 1 — uj t uj ) 1 — UJ T UJ 

2x T (x — uj) 2uj t (x — uj) 



x - uj) T V 2 F(uj)(x- uj) + h 



1 — UJ T UJ 1 — UJ T UJ 



4cu T (x — uj) ( 2uj T {x — uj) \ uj T {x — uj) 
\ : t?, 4 I - — h h 



1 — UJ T UJ \ 1 — UJ T UJ J 1 — UJ T UJ 

(x - uj) t V 2 F(uj)(x - uj) + h 



■(2/1-3) 



1 — UJ 1 UJ 

2oj t (x — uj) „ / u; T (:r — a;) x 



1 — cj t u; \ 1 — uj t uj 

Let $s := 1/(1 - \omega^T\omega)$ and $t := \omega^T(x - \omega)$. Thus, $x^T\omega = \omega^T x = t + \omega^T\omega$, $\omega^T\omega = (s-1)/s$, and $|t| \le 2$ since $x, \omega \in B_n$. All this gives



h 2 > 


(x 


— UJ 


t V 2 F(uj) 


{x — 


Uj 1 


+ h 


\ 


(■>' 


— ^ 


V r \UJ ) 


(x - 


UJ t 






(x 


— uj) 


t V 2 F(uj) 


(x — 


uj) 


+ h 




(x 


— uj) 


t V 2 F(uj) 


(x — 


uj) 


+ h 




(x 


— uj) 


t V 2 F(uj) 


(x — 


uj) 


+ h 


> 


(x 


— uj) 


t V 2 F(uj) 


(x — 


uj) 


+ h 



(h - 2)Ast - 8s 2 t 2 - 2 
+ h-(h- 2)8s - 32s 2 - 2. 

Because in the algorithm $b_i < 0$ for all $i$, equation (5.12) gives



,2.2 



,2+2 



,2+2 



2s- 



2.2 



(2h - A)2st - 8s 2 t 



(5.14) 



$$\frac{1}{1 - \omega^T\omega} \le \frac{h+2}{2}. \tag{5.15}$$

Plugging in this bound gives

$$(x - \omega)^T \nabla^2 F(\omega)(x - \omega) \le 13h^2 + 31h + 18. \tag{5.16}$$

The right side of the above inequality is less than $14h^2$ if $h > 31$. □
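The arithmetic behind the last step can be checked mechanically. The sketch below assumes our reading of the garbled constants: the quadratic form is bounded by $h^2 - h + 8(h-2)s + 32s^2 + 2$ with $s = 1/(1-\omega^T\omega) \le (h+2)/2$, which simplifies to $13h^2 + 31h + 18$, and the target is $14h^2$. The function names are illustrative only.

```python
def quadratic_form_bound(h):
    """h^2 - h + 8(h - 2)s + 32 s^2 + 2 evaluated at s = (h + 2)/2."""
    # 8(h-2)s = 4(h-2)(h+2) and 32 s^2 = 8(h+2)^2, so integer arithmetic suffices.
    return h * h - h + 4 * (h - 2) * (h + 2) + 8 * (h + 2) ** 2 + 2

def simplified_bound(h):
    """The simplified form 13 h^2 + 31 h + 18."""
    return 13 * h * h + 31 * h + 18

def fits_in_ellipsoid(h):
    """True when the bound is strictly below 14 h^2."""
    return simplified_bound(h) < 14 * h * h

# Smallest number of hyperplanes for which the 14 h^2 bound applies.
smallest_h = next(h for h in range(1, 10_000) if fits_in_ellipsoid(h))
```

Here `smallest_h` evaluates to 32, which agrees with the strict condition $h > 31$: the inequality $h^2 - 31h - 18 > 0$ fails at $h = 31$ and holds from $h = 32$ onwards.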



The next lemma is required for the stopping condition based on $P$'s becoming too thin to contain $K_p$. Define the width of $P$ in the direction of $a_i$ as $\text{width}(a_i) := \max_{x,y \in P} a_i^T(x - y)$.

Lemma 24 (Lemma 10 in [91]). For every $i$, $\text{width}(a_i) \le (a_i^T\omega - b_i)(3h + 4)$. As well, for every $i$, $\text{width}(a_i) \le (a_i^T\omega - b_i)(h + 4/(1 - \|\omega\|^2))$.

Proof. From equation (5.13) it follows that

$$h = \sum_{i=1}^{h} \frac{a_i^T x - b_i}{a_i^T\omega - b_i} - \frac{2\omega^T(x - \omega)}{1 - \omega^T\omega}$$

for all $x \in P$. Since for every index $j$ there exists some $x^j$ in $P$ satisfying $\text{width}(a_j) \le a_j^T x^j - b_j$, we have

$$\frac{\text{width}(a_j)}{a_j^T\omega - b_j} \le \frac{a_j^T x^j - b_j}{a_j^T\omega - b_j} \le \sum_{i=1}^{h} \frac{a_i^T x^j - b_i}{a_i^T\omega - b_i} = h + \frac{2\omega^T(x^j - \omega)}{1 - \omega^T\omega} \le h + \frac{4}{1 - \omega^T\omega},$$

where the last inequality follows from $x^j, \omega \in B_n$. This proves the second statement of the lemma. Employing the bound $\frac{1}{1 - \omega^T\omega} \le \frac{h+2}{2}$ as in the proof of Lemma 23 proves the first statement. □

Now we state the main results needed to derive the stopping conditions of the algorithm. At each iteration, we assume that we have an approximate analytic center $z$ that satisfies $\lambda(z) = \sqrt{\Psi(z)} \le \rho \le \rho_0 < \frac{1}{3}$. In Section 5.3.4 we will explain how to achieve this approximation using Newton iterates. Lemma 22 gives

$$\omega \in E(\nabla^2 F(z), z, q_\rho). \tag{5.17}$$

In what follows, we will set $\zeta := q_\rho$ and $\zeta_0 := q_{\rho_0}$. We also assume the approximation satisfies $\zeta \le \zeta_0 < 1$. We regard $\rho$ and $\zeta$ as varying parameters with respective tight upper bounds $\rho_0$ and $\zeta_0$, which are constants, to be selected after the analysis is complete. As such, our $\rho$ and $\zeta$ correspond to those in [91].

The structure of the argument is exactly as in [91], mutatis mutandis. Hence, the proofs are in the appendix; they are included for completeness and to provide justification for the
constants we use in the algorithm, since our constants differ from those in 

Derivation of Stopping Condition 1: Volume Argument 

Lemma 25 (Lemma 17 in j9l| L Let z be an approximation to lu such thatuj G E{y 2 F(z), z, (). 
Suppose the hyperplane (a, ft) is added in Case 2 with 7q = 7 2 = - ~ - Then, 



\a T (z — uj) 



a 1 uj — p 



(c) VJu) < f := 7 2 



l V/ l 



i-Ct/ Vi - C 



2 



(5.18) 

With £ suitably small enough that 7 < |, we have by Lemma 1221 that 

UJ a eE(V 2 F a (u),UJ,q^. (5.19) 

Lemma 26 (Lemma 18 in |9l| |). Suppose a hyperplane is added in Case 2, and the analytic 
center moves from uj to uj & . Let 7 = a/ a T (V 2 F(z))~ 1 a/(a T z — f3) 2 . If '7 < |, then 
Theorem 27 (Approximation version of Theorem 13 in 91]). Suppose that maxK^/, Hi(z
2 at the beginning of an iteration, i.e. Case 2 is about to occur. If the current search space P 
is determined by h hyperplanes (in addition to the unit hypersphere) , then 

$$\det(\nabla^2 F(z)) > 2^{-n}(1 + C_2)^h = 2^{(\log_2(1+C_2))h - n}, \tag{5.20}$$

for some positive constant $C_2$ which depends on the parameters $\sigma$ and $\gamma$ of the algorithm and the "minimal goodness" $\zeta_0$ of the approximation to the analytic centers. This can be improved to

$$\det(\nabla^2 F(z)) > 2^{-n}(2.5)(1 + C_2)^{h-1}. \tag{5.21}$$



Lemma 28 (Lemma 19 in [91]). For the approximate analytic center $z$ with $\omega \in E(\nabla^2 F(z), z, \zeta)$, we have

$$P \subseteq E(\nabla^2 F(z), z, \vartheta),$$

where

$$\vartheta := \sqrt{2\left(\frac{14h^2}{(1-\zeta)^2} + \zeta^2\right)}, \quad \text{if } h > 31.$$

From here on we will assume that $h > 31$, that is, that the minimum number of total hyperplanes will be 31. We will also assume that $\zeta < 1/16$, in which case $\vartheta$ in the above lemma satisfies $\vartheta < 6h$.

Theorem 29 (Approximation version of Theorem 14 in j9l| ^. There exists a constant 
v, independent of h, n, R, and 5, and there exists a function u(n,6) E 0(poly(n, log(4))) 
such that if h = vnu(n,5), then the volume of K p is sufficiently small so as to assert that 

P eS(K,6). 

This completes the derivation of Stopping Condition 1. 
Derivation of Stopping Condition 2: Width Argument 



Lemma 30 (Lemma 16 in (9l|). Let ( < 1. If u e E(V 2 F(z), z,() , then for all i, 
l<i<h, 

, \ o-j(z) 
Define



N(x) 



In ( a \ X M - ln(l - x T x) = F(x) + V ln( K (a i5 h)). (5.23) 



Note that N(x) — N(y) = F(x) — F(y) in any given iteration. 

Theorem 31 (Approximation version of Theorem 11 in [91]). There exists a positive constant $\theta$, independent of $h$, $n$, $R$, and $\delta$, such that after $i$ iterations of the algorithm, $N(\omega) > \theta i$. The constant $\theta$ will depend on the parameters of the algorithm.

Theorem 32 (Approximation version of Theorem 15 in [91]). If the algorithm does not first find a separating hyperplane or halt by Stopping Condition 1, then, within $O(nu\log(nuR/\delta))$ iterations, Stopping Condition 2 must be met. If Stopping Condition 2 is met, then the set $K_p$ is negligibly small and the algorithm may return "$p \in S(K, \delta)$".

This completes the derivation of Stopping Condition 2. 



5.3.2 Producing Good Cutting Planes 

Suppose that $M_p$ is large enough that the algorithm must return an element of $M_p$. Up until this point, we have assumed that the cutting planes generated by the algorithm do not accidentally slice off any portion of $K_p$, that is, that $m^T a_i > 0$ for all $i = 1, \ldots, h$ and for all $m \in M_p$. With finite-precision computations, this condition is not sufficient. In order to combat the effects of round-off, we would ideally require something stronger: for all $m \in M_p$,

m 7 Oj > 5, for allz = 1, . . . , h, (5-24) 

for some 8 > 0. As it stands, this requirement is tricky to achieve. However, if we merely 
insist that ()5.24j) holds for all m in the smaller set 

M'p := {c G S n : c T k + 5' < c T p VA; G S(K, e)}, (5.25) 

for some 5' > 0, then we can ensure that the cutting planes do not accidentally slice off any 
portion of K' p := [ConvexHull (M' p U {0})] \ 0. The size of K'p is still large enough to give the 
asymptotic behaviour we desire from our algorithm. 

Lemma 33. Let P be the current search space, defined by h cutting planes {x : ajx = bi}, 
where ||aj|| = 1, for i = l,...,h. Assume that z is an approximate analytic center of P 
satisfying \{z) < p such that ( := q p < 1. Assume further that 



1 + 5 V2' 
Let c := z/\\z\\. If m T ai > S for all i = 1, . . . , h, then m T c > |.



2 • 

h 



Proof. Equation (J5.2|) says that d := u)/\\u\ | can be written as Yli=i Vi a i with rji > for all i. 
Thus, 



m T uj T 



c = y^^(m T a^) > 5 rjj > 5, 



— iT = m 



because 



1 = d T d = J2v l (ajd) < J2vi\ajc'\ < J^Vi- 

i=l i=l i=l 

Since uj G E(V 2 F(z), z, (), we have 2\\z — uj\\ 2 /(1 — \ \z\\ 2 ) < ( 2 which implies 



z-lu\\<C/V2. (5.26) 



Consider the two cases: 
Case A: m T z > m T uj 
In this case, we have 



rp m T u) ~||a;| 
m C> , , , , > O-rr-r 



\z\\ \\z\ 



Case B: m T z < m T uJ 

In this case, < mFuj — m T z = m T (uj — z) < \ \uj — z\ \ < (/V2 gives 



T 

i i 

m c> > 



rp . m UJ 





\uj\ 




C/V2 




\\z\ 





Now consider two other cases: 
Case I: ||o;|| > ||^|| 

In this case, we have 



\u\ 



> I. 



Case II: ||o;|| < ||z|| 

In this case, (j5.26J) gives 



a; > \ \z\ \ — 



Examining all four combinations of the above cases: 
Case AI:

m T c > S\\u\\/\\z\\ > 5 

Case All: 



t . K\\A\-C/^) r k/V2 



m c > r-j — j-: = 5 — 

\\z\\ \\z 



so that as long as ( < \\z\\/y/2, we have m T c > 5/2; 
Case BI: 



t *IMI C/V2^~ C/V2 
m c> — j-j — j-j j-j— j— > o n— rr 



so that as long as C < H^H^/V^) we have m T c > 8/2; 
Case BII: 



m T c > "VII^II-S/v^ _ wv. = ~ s _ (1 + ^ 





z\\ 




C/V2) C/V2 




\\z\ 




\z\\ 



so that as long as ( < -^Tg\\z\\/V2, we have m T c > 5/2. 

The last case imposes the smallest upper bound on £, which is the upper bound in the 
statement of the lemma. □ 

Assume that the hypotheses of Lemma ESI hold for all m £ M' so that m T c > 5/2 for all 
m £ M' p and some 5 > 0. Suppose the test point c is given to the weak optimization oracle 
which returns k c . Then 



c T k c + e<c T p => c T x < c T p Vx £ K, (5.27) 

so that the left-hand side of (|5.27j) is a valid acceptance criterion (appearing in the algorithm) 
if we are solving the in-biased weak separation problem. For a worst-case analysis, we assume 
that p has distance 5 from the boundary of K. It is convenient to divide this distance into 
three parts such that 5' + e < 5 (see Figure lo~2"l on page 179)1 . The rejection criterion for a test 
vector c is simply the logical negation of the left-hand side of 1)5.27)1 : 



-e< -c T {p-k c ). (5.28) 
Thus, we have a revised version of Lemma fUH 

Lemma 34. Suppose that m £ M' and that c satisfies the rejection criterion (j5.28j) . Let 
a := (p — k c ) — Proj c (p — k c ). If m T c > then m T a > 5' — e. 
Proof. Case —c(p — k c ) > 0:

m T a = m T (p - k c ) + [-c T (p - k c )} (m T c) > 5' + = 5' (5.29) 
Else -c T {p - k c ) < 0: 

m T a = m T (p — k c ) + [— c T (p — k c )](m T c) > 5' — e\m T c\ = 5' — e (5.30) 

□ 

Therefore, we set e := 5' /2 so that m T a > d~'/2 in the conclusion of the lemma. Since we can 
assume that p G B(0,R), and since k c G -8(0, R), we have \\p — k c \\ < 2R. Thus, ||a|| < 2R. 
Letting a be the normal vector to the new cutting plane, we have 

rp m T a 572 5' 

m a = -r—r > tttt > tt;- 
||a|| ||a|| 4R 

If we set 5 := 5'/4R, then, as long as the machine precision is sufficiently high so that the error 
in m T c (due to round-off error of c) is less than 5/2, the cutting planes do not accidentally 
slice off any bit of K' . We have assumed that the first normalised analytic center c\ := p/\ \p\ \ 
used in the algorithm satisfies m T c\ > S for all m G M' p . Note we actually have that m T C\ > 8' 
for all m G M' p , because m T p > for all m G M p (Fact IT3*|) . Therefore, it makes sense to set 
5' := 25/5, and thus e := 5' /2 = 5/5. 

5.3.3 Derivation of r 

Now we derive the radius $r$ as a function of $R$ and $\delta$. In light of the previous subsection, $r$ is redefined as a lower bound on the maximum radius of a ball that fits inside $K'_p$.

First, we derive a lower bound $\theta$ on the one-dimensional angle that defines the maximum-size hypercircular-based cone (emanating from the origin) that fits inside $K'_p$. The bound will assume only that $K$ is convex, centered at the origin 0, and contained in $B(0, R)$.

To get this lower bound, we need to derive a worst-case scenario for $p$ and $K$ that makes $K_p$ as small as possible. Suppose $p$ has minimal distance $\delta$ from the boundary of $K$. Thus, the ball $B(p, \delta)$ intersects $K$ only at one point $k^* \in K$. Consider the hyperplane $H := \{x : (p - k^*)^T x = (p - k^*)^T k^*\}$; it is tangent to $B(p, \delta)$ at $k^*$. No point $k$ in $K$ is on the same side of $H$ as $p$ (that is, satisfies $(p - k^*)^T k > (p - k^*)^T k^*$), else the line from $k$ to $k^*$ would contain points in $K$ that intersect $B(p, \delta)$ and hence contradict the minimality of the distance from $p$ to $k^*$. If we let

$$K^* := B(0, R) \cap \{x : (p - k^*)^T x \le (p - k^*)^T k^*\},$$

then we have shown that $K \subseteq K^*$. Let $M^*$ be $\{c \in S_n : c^T k < c^T p \;\; \forall k \in K^*\}$. It follows that $M^* \subseteq M_p$. Finally, we show that if $p$ is centered next to the set $C^* := H \cap B(0, R)$, the set $M^*$ is as small as possible. Note that $C^*$ is a hyperdisc of radius $R^*$, where $R^* \le R$. Fig. 5.1 defines the angles $\theta_1$ and $\theta_2$ as a function of the displacement $x$ of $p$ from the center of $C^*$, for $x \in [0, R^*]$. For a lower bound on $M^*$ we want to minimise the sum $\theta_1 + \theta_2$. Since $\partial\theta_1/\partial x < \partial\theta_2/\partial x$, this sum is minimised at $x = 0$, that is, when $p$ is centered next to $C^*$. As well, the value of $R^*$ that minimises the sum is $R^* = R$. Define $M^{*\prime}$ with respect to $K^*$ just as $M'_p$ was defined with respect to $K$. Since

$$M^{*\prime} \subseteq M^* \subseteq M_p,$$

calculating a lower bound on the size of $M^{*\prime}$ is sufficient. Below, instead of working with $K^*$ explicitly, we assume the worst case where $K$ is $K^*$ with $R^* = R$ and $p$ centered next to $C^*$.

Figure 5.1: The solid angle $\theta_1 + \theta_2$ of the hypercircular-based cone as a function of displacement $x$ from center of $p$.




Figure 5.2: Derivation of solid angle $2\theta$ of hypercircular-based cone in terms of $R$ and $\delta$.
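The angle $\theta$ can be seen in Fig. 5.2. We have

$$\tan\theta = \frac{Y}{R + \epsilon}, \qquad \cos\theta = \frac{\delta'}{X},$$

so that

$$2\delta' = X + Y = (R + \epsilon)\tan\theta + \frac{\delta'}{\cos\theta},$$

or

$$\tan\theta = \frac{\delta'}{R + \epsilon}\left(2 - \frac{1}{\cos\theta}\right) \tag{5.31}$$

$$\sin\theta = \frac{\delta'}{R + \epsilon}(2\cos\theta - 1). \tag{5.32}$$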


Figure 5.3: Derivation of radius $r$ as a function of $\theta$.

Now, we derive $r$ as a lower bound on the maximum radius of a ball that fits inside the hypercircular-based cone defined by $\theta$. From Fig. 5.3 we have

$$r = \sin\theta \, \tan\left(\frac{\pi}{4} - \frac{\theta}{2}\right).$$

Since $\tan(\psi - \phi) = (\tan\psi - \tan\phi)/(1 + \tan\psi\tan\phi)$,

$$r = \sin\theta \, \frac{1 - \tan(\theta/2)}{1 + \tan(\theta/2)}. \tag{5.33}$$
As $\delta \to 0$ (and hence $\delta', \epsilon \to 0$), equation (5.31) tends to $\tan\theta = \delta'/R$ and equation (5.33) tends to $r = 2\delta/5R$. For convenience of exposition, we use the approximation $r \approx 2\delta/5R$. In practice, equation (5.33) (in conjunction with a numerical solution for $\theta$) may be used in the derivation of Stopping Conditions 1 and 2.
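A numerical solution for $\theta$ can be sketched as follows. The code assumes our reading of the two formulas, namely $\sin\theta = \frac{\delta'}{R+\epsilon}(2\cos\theta - 1)$ and $r = \sin\theta\,\frac{1-\tan(\theta/2)}{1+\tan(\theta/2)}$, together with the choices $\delta' = 2\delta/5$ and $\epsilon = \delta/5$ made in Section 5.3.2. Bisection is our own choice of root finder, and the function names are illustrative.

```python
import math

def solve_theta(delta, R, iters=200):
    """Solve sin(t) = c (2 cos(t) - 1) for t in (0, pi/3) by bisection,
    where c = delta' / (R + eps), delta' = 2 delta / 5 and eps = delta / 5."""
    c = (2.0 * delta / 5.0) / (R + delta / 5.0)

    def f(t):
        return math.sin(t) - c * (2.0 * math.cos(t) - 1.0)

    # f(0) = -c < 0 and f(pi/3) = sin(pi/3) > 0; f is increasing in between.
    lo, hi = 0.0, math.pi / 3.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inner_radius(delta, R):
    """r = sin(theta) (1 - tan(theta/2)) / (1 + tan(theta/2))."""
    t = solve_theta(delta, R)
    half = math.tan(t / 2.0)
    return math.sin(t) * (1.0 - half) / (1.0 + half)

def approx_radius(delta, R):
    """The small-delta approximation r = 2 delta / (5 R)."""
    return 2.0 * delta / (5.0 * R)

# Hypothetical parameter values for illustration.
delta, R = 1e-3, 1.0
r_exact = inner_radius(delta, R)
r_approx = approx_radius(delta, R)
```

Under these formulas the exact value always lies below $2\delta/5R$, since $r < \sin\theta < \delta'/(R+\epsilon) < \delta'/R$, and the two agree closely when $\delta$ is small relative to $R$.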



5.3.4 Newton Iterates 

The next theorem says that, with respect to F, the new (actual) analytic center and 
the old (approximate) analytic center are never too far apart, so that the Newton procedure 
for finding the new approximate analytic center terminates quickly (see the Appendix for a 
proof). 



Theorem 35 (Theorem 20 in [91]). There exists some constant $C_d$ such that any time a hyperplane is discarded in Subcase 1.1, $F_d(z) - F_d(\omega_d) \le C_d$. Likewise, there exists some constant $C_a$ such that any time a hyperplane is added in Case 2, $F_a(z) - F_a(\omega_a) \le C_a$.

In Subcase 1.1 or Case 2, to calculate new approximations $z_{\text{new}}$ to the new analytic center $\omega_{\text{new}}$, we perform damped Newton iterations, as defined in Q, starting at the old approximate analytic center $z$. Denote the sequence of ensuing Newton iterates by $\{x_i : i = 0, 1, \ldots\}$. The starting point is $x_0 := z$. Define $\lambda_* := 2 - \sqrt{3} = 0.2679\ldots$. For $i \ge 0$, define the Newton iterates as:

$$x_{i+1} := x_i - q_i (\nabla^2 F(x_i))^{-1} \nabla F(x_i),$$

where

$$q_i := \begin{cases} (1 + \lambda(x_i))^{-1} & \text{if } \lambda(x_i) > \lambda_*, \\ 1 & \text{if } \lambda(x_i) \le \lambda_*. \end{cases}$$

Theorem 2.2.3 in 0| shows that, in the first stage of the Newton process ($\lambda(x_i) > \lambda_*$), the difference $F(x_i) - F(x_{i+1})$ is at least $\lambda_*$ and, in the second stage of the Newton process ($\lambda(x_i) \le \lambda_*$), $\lambda(x_{i+1}) \le \lambda(x_i)/2$. Thus, Theorem 35 says that, within $O(1)$ iterations, the value of $\lambda(x_i)$ will start decreasing quadratically. The total number of Newton iterations required is no more than

$$\lceil C_d/\lambda_* \rceil + \lceil \log_2(\lambda_*/\rho_0) \rceil,$$

in Subcase 1.1, and

$$\lceil C_a/\lambda_* \rceil + \lceil \log_2(\lambda_*/\rho_0) \rceil,$$

in Case 2.
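The two-stage Newton process can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation used in the thesis: $F$ is taken to be the barrier $-\sum_i \ln(a_i^T x - b_i) - \ln(1 - x^T x)$, $\lambda(x)$ is taken to be the Newton decrement $\sqrt{\nabla F(x)^T (\nabla^2 F(x))^{-1} \nabla F(x)}$, and the iteration stops once $\lambda(x) \le \rho_0$. The linear solver, the instance, and all names are hypothetical.

```python
import math

def grad_hess(A, b, x):
    """Gradient and Hessian of the assumed barrier at x."""
    n = len(x)
    ball = 1.0 - sum(v * v for v in x)
    g = [2.0 * x[k] / ball for k in range(n)]
    H = [[4.0 * x[r] * x[c] / ball ** 2 + (2.0 / ball if r == c else 0.0)
          for c in range(n)] for r in range(n)]
    for ai, bi in zip(A, b):
        s = sum(ai[k] * x[k] for k in range(n)) - bi
        for r in range(n):
            g[r] -= ai[r] / s
            for c in range(n):
                H[r][c] += ai[r] * ai[c] / s ** 2
    return g, H

def solve(H, g):
    """Solve H d = g by Gaussian elimination with partial pivoting."""
    n = len(g)
    M = [row[:] + [g[i]] for i, row in enumerate(H)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    d = [0.0] * n
    for r in range(n - 1, -1, -1):
        d[r] = (M[r][n] - sum(M[r][c] * d[c] for c in range(r + 1, n))) / M[r][r]
    return d

LAMBDA_STAR = 2.0 - math.sqrt(3.0)

def damped_newton(A, b, z, rho0=1e-3, max_iter=100):
    """Iterate x <- x - q * H^{-1} g with q = 1/(1 + lambda) while
    lambda > LAMBDA_STAR and q = 1 afterwards; stop when lambda <= rho0."""
    x = list(z)
    for it in range(max_iter):
        g, H = grad_hess(A, b, x)
        d = solve(H, g)  # d = H^{-1} g
        lam = math.sqrt(max(0.0, sum(gk * dk for gk, dk in zip(g, d))))
        if lam <= rho0:
            return x, lam, it
        q = 1.0 / (1.0 + lam) if lam > LAMBDA_STAR else 1.0
        x = [xk - q * dk for xk, dk in zip(x, d)]
    raise RuntimeError("Newton iteration did not reach the requested accuracy")

# Hypothetical instance: three central cuts (all b_i = 0) in the plane.
A = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
b = [0.0, 0.0, 0.0]
omega, lam, iters = damped_newton(A, b, [0.3, 0.3], rho0=1e-9)
norm_sq = sum(v * v for v in omega)
```

For central cuts the condition $\omega^T \nabla F(\omega) = 0$ forces $\|\omega\|^2 = h/(h+2)$, so with $h = 3$ the computed `norm_sq` should be close to $0.6$; this gives an independent check on the iteration.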

5.3.5 Selecting the Constants 

Finally, we summarise the values of all the parameters of the algorithm and give values of 
the constants that work in general and for some special cases. 
The parameters have been defined as follows: 

r := 2S/5R 

u ■= 21og 2 (n) + log 2 (l/r) 
5 ■= S'/2R = 5/5R. 

For the constants, we have to summarise the strongest conditions that the convergence 
analysis placed on them: 

AO) < 1/3 
7 < 1/3 

C < 0.02 [see proof of Theorem EU Case 2] 
d > 

c 2 > 

C3 < 1/3 [to invoke Lemma l2*T] 
C 4 < 0.615 

C5 < 1 [to invoke Lemma |2*U] 
C 6 > 

3 + (log 2 (12) + l/2)/2 < X - (z/log 2 (l + C 2 ) - log 2 (z/)) . 

The following list of values can be shown to satisfy the above constraints: 

p := 0.001 
Co := q P0 = 0.00101 
7o := 0.25 
do := 0.08 
v := 1078. 
The potentially smallest upper bound imposed on C is

in Lemma ESI We now show that this upper bound is never so small as to require an un- 
reasonable number of Newton iterates, by deriving a lower bound on ||z|| based on Stopping 
Condition 2. While Stopping Condition 2 is not satisfied, we have 

2r < (ajz - bj){Zvnu + 4)/(l - Co) Vj, 

thus, in particular, for j = 1, 



T 2r(l-Co 
\z\ \ > a? z > 



1 A ' 

?>vnu + 4 

Thus the lowest upper bound ever imposed on C will be 

_^^r(l-C,) 3 
1 + 5 3z/nw + 4 v 7 

Let t be the righthand side of the above inequality; note that t is lower-bounded by a polyno- 
mial in i and -|. This gives a tight, worst-case upper bound on p of t — t 2 + 1 3 /3 which is still 
a polynomial in ~ and -|. Thus, in the worst case, the required number of Newton iterates is 
C(polylog(n, f ))• 

5.3.6 Tighter Stopping Conditions 

The upper bound (h + 2)/2 on (1 — ||c<j|| 2 ) _1 in (|5.15|) is not tight because it throws away 
the entire summation in (j5.12|) . Line ()5.17|) gives 

IN < \\z\\ + Co VW(v^F(i)p), 

where A max ((V 2 -F(z)) _1 ) is the largest eigenvalue of (V 2 F(z))~ 1 ] which gives 

:= (i- (m\+ (oV^ii^nz))- 1 )) 2 ) ■ 



'l-llo; 



|2\— 1 



Recalling Lemma 124] Stopping Condition 2 can be immediately tightened to 

2r> [min *\ aIz - k} \ h + 4uj(z)). 
1 — Co 
To tighten Stopping Condition 1, we go back to line (5.14), which gives

$$(x - \omega)^T \nabla^2 F(\omega)(x - \omega) \le h^2 + h(8w(z) - 1) - 16w(z) + 32w(z)^2 + 2.$$
In conjunction with the proof of Lemma |2*%1 we get 

P C E(V 2 F(z) jZ ,$') 

where 



In I h 2 + h(8w(z) - I) - IQzu(z) + 32w(z) 2 + 2 
* :=W2( 7Y^- 2 + C oi 



Using this and Theorem I23 line (|A.26|) becomes 



{2-d') n fr 



2 [log 2 ((2.5)(l+C 2 )"-i)]/2-n/2 \ n 

2$' r 



< 



2[log 2 ((2.5)(l+C 2 ) h - 1 )]/2n-l/2 „ 

log 2 (2#) - [log 2 ((2.5)(l + C 2 ) h - l )/2n - 1/2] < log 2 (r/n), 
to give the stopping condition 

h > ] n \ [2nlog 2 (2n^'/r) + n] + log 2 (4/5). 

log 2 (l + C 2 ) 

Employing these dynamic stopping conditions ensures that the number of calls to the WOPT oracle is minimised. When WOPT is NP-hard, as in the quantum separability problem, this is important in practice.



5.4 Complexity and Discussion 



As in jsij, we only ever have to compute a 1 n + 1 of the (Ji(z) values, regardless of the 
number of hyperplanes h. 

Theorem 36 (Theorem 21 in j9l| ). In Case 1, if there is at least one hyperplane satisfying 
the conditions of Subcase 1.1, then we must discover such a hyperplane in at most a^n + 1 
evaluations of the o~j(z) values. 



Proof. See □ 

The total arithmetic complexity of the algorithm is $O((T + n^3\log(R/\delta))\,n\log^2(nR/\delta))$, where $T$ is the cost of one call to WOPT. See ^ll for a detailed discussion of the arithmetic complexity of the algorithm, including the complexity of calculating the inverse Hessians.

Note that - in the worst case - the algorithm requires more machine precision than the algorithm in 0| due to (5.35). However, I conjecture that, in the vast majority of instances, the magnitude $\|z\|$ of the approximate analytic center remains larger than a constant; hence the algorithm, which incorporates the dynamic bound (5.34), does not require excessive precision. Some evidence for this conjecture is based on the following result:

Fact 37. If $b_i = 0$ for all $i$, that is, if all cuts are central (through the origin, in our case), then, when a cut is added, the new analytic centre $\omega_a$ is always bigger than the old analytic centre $\omega$; that is, $\|\omega_a\| > \|\omega\|$.

Proof. From left-multiplying $\nabla F(\omega) = 0$ by $\omega^T$, we get

$$\frac{\|\omega\|^2}{1 - \|\omega\|^2} = \frac{1}{2}\sum_{i=1}^{h} \frac{a_i^T\omega}{a_i^T\omega - b_i} = \frac{1}{2}\sum_{i=1}^{h} \frac{a_i^T\omega}{a_i^T\omega} = h/2. \tag{5.36}$$

Similarly, we get

$$\frac{\|\omega_a\|^2}{1 - \|\omega_a\|^2} = (h + 1)/2. \tag{5.37}$$

Since the quantity $\|\omega\|^2/(1 - \|\omega\|^2)$ increases as $\|\omega\|$ increases (and similarly for $\|\omega_a\|$), we have $\|\omega_a\| > \|\omega\|$. □
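Solving $\|\omega\|^2/(1 - \|\omega\|^2) = h/2$ for the norm gives $\|\omega\|^2 = h/(h+2)$ when all $h$ cuts are central. The short sketch below tabulates this closed form; it assumes only that relation, and the function name is illustrative.

```python
import math

def central_cut_centre_norm(h):
    """Norm of the analytic centre when all h cuts pass through the origin,
    obtained by solving ||w||^2 / (1 - ||w||^2) = h / 2 for ||w||."""
    return math.sqrt(h / (h + 2.0))

# Norms for h = 1, ..., 8 central cuts.
norms = [central_cut_centre_norm(h) for h in range(1, 9)]
```

The values increase strictly with $h$ and stay below 1, so under central cuts the analytic centre moves towards the unit sphere but never reaches it.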

Thus, if all the cutting planes go through the origin (all $b_j = 0$), then the analytic center of $P$ grows in magnitude with each additional cutting plane. Since the shifts $b_j$ of the cutting planes tend to zero as the algorithm proceeds (because the eigenvalues of $(\nabla^2 F(z))^{-1}$ tend to zero), the behaviour of the analytic center tends to the case of all cutting planes going through the origin. If the requirement for shallow cutting in J^jJ could be removed somehow, then our algorithm would be free of this worst case. It is an open problem whether there exists a polynomial-time, analytic centre algorithm for the convex feasibility problem that does not require shallow cutting.

With respect to the number of calls to WOPT, how might the algorithm compare with 
the unmodified analytic centre algorithm of 0] applied to Q p and (a weakened) SEPg p (as 
outlined in Section EQjj) ? 1 The new cut-generation rule elegantly combines the routine SEP/^* 
and the constraint {c : p T c > 1}. 

Problem 9. Analyse, and compare more carefully, the two cut-generation rules. 

I have shown that the Atkinson-Vaidya algorithm works with an initial bounding sphere, in place of a hyperbox. This result means that we can use this modified algorithm with $\text{SEP}_{K^*}$ and initial outer approximation equal to

$$P_0 = B(0, R^*) \cap \{c : p^T c > 1\}, \tag{5.38}$$

where $R^*$ is the radius of the smallest origin-centred ball that contains the polar $K^*$. Note that such a radius is available when the original set $K$ is known to contain a ball of radius $r_0$, in which case, $R^* = 1/r_0$. In the quantum separability problem, we have such a radius, given by the maximum separable ball centred at the maximally mixed state (see Section rO|). Fact ED gives the analytic centre of $P_0$ (note that if a hyperbox was used instead, the analytic centre may not be easily computable because of a lack of symmetry, e.g. the centre is not necessarily a scalar multiple of $p$). This likely makes the standard method more efficient. Using this method with deep cuts may, in practice, yield the fastest fully polynomial algorithm.

1 Actually, finding any point $c$ such that $tc \in Q_p$, for $t > 0$, suffices.



5.5 Application to Quantum Separability Problem 

We can handle two scenarios - one experimental, as described in Section 4.3, and the other theoretical. In the theoretical scenario, we assume that we know the density matrix for the given state $\rho \in \mathcal{D}_{M,N}$ but we do not know whether $\rho$ is separable; knowing the density matrix corresponds to having gathered all $M^2N^2 - 1$ independent expected values of the physical state $\rho$ in the experimental scenario. Since the algorithm finds an entanglement witness when $\rho$ is entangled, it could also be applied when $\rho$ is known to be entangled but an entanglement witness for $\rho$ is desired (though one may want to apply the entanglement witness optimization procedure [47] to the result of the algorithm, as our algorithm does not output optimal entanglement witnesses).

We characterise all potential entanglement witnesses for $\rho$ by

$$\mathcal{W} = \{A \in \mathbb{H}_{M,N} : \text{tr}(A) = 0, \; \text{tr}(A^2) \le 1\}. \tag{5.39}$$

For entangled $\rho$, define $\mathcal{W}_\rho$ to be the subset of $\mathcal{W}$ consisting of (right) entanglement witnesses that detect $\rho$; if $\rho$ is separable, then define $\mathcal{W}_\rho$ to be empty.
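To make the definition concrete, here is a small two-qubit ($M = N = 2$) illustration in Python. It assumes that a right entanglement witness $A$ detects $\rho$ when $\max_{\sigma \text{ separable}} \text{tr}(A\sigma) < \text{tr}(A\rho)$, and it uses the example $A = |\Phi^+\rangle\langle\Phi^+| - I/4$, which is our own choice and does not come from the text. Random product states only estimate the separable maximum from below; the exact value $1/4$ follows from the standard fact that a product state overlaps a maximally entangled two-qubit state by at most $1/2$.

```python
import math
import random

def expval(A, psi):
    """<psi|A|psi> = tr(A |psi><psi|) for a Hermitian matrix A (list of lists)."""
    n = len(psi)
    return sum(psi[r].conjugate() * A[r][c] * psi[c]
               for r in range(n) for c in range(n)).real

def trace_of_product(A, B):
    """tr(A B) for square matrices given as lists of lists."""
    n = len(A)
    return sum(A[r][c] * B[c][r] for r in range(n) for c in range(n)).real

def random_qubit(rng):
    """A random unit vector in C^2."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def product_state(a, b):
    """Kronecker product |a> (x) |b> as a length-4 vector."""
    return [x * y for x in a for y in b]

# |Phi+> = (|00> + |11>)/sqrt(2) and the traceless operator A = |Phi+><Phi+| - I/4.
phi_plus = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
A = [[phi_plus[r] * phi_plus[c] - (0.25 if r == c else 0.0) for c in range(4)]
     for r in range(4)]

trace_A = sum(A[k][k] for k in range(4))   # membership test: tr(A) = 0
trace_A2 = trace_of_product(A, A)          # membership test: tr(A^2) <= 1

rng = random.Random(1)
sampled_max = max(expval(A, product_state(random_qubit(rng), random_qubit(rng)))
                  for _ in range(5000))    # lower estimate of the separable maximum
value_on_rho = expval(A, phi_plus)         # tr(A rho) for rho = |Phi+><Phi+|
```

In this example $\text{tr}(A) = 0$ and $\text{tr}(A^2) = 3/4$, the sampled values never exceed $1/4$, and $\text{tr}(A\rho) = 3/4$, so $A$ separates $\rho$ from the sampled product states.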

Let $\mathcal{B} = \{X_i : i = 0, 1, \ldots, M^2N^2 - 1\}$ be a basis as described in the beginning of Section 2.3. Let $j$ be the number of nontrivial expected values of $\rho$ that are known, $2 \le j \le M^2N^2 - 1$; that is, (without loss) assume we know the expected values of the elements of $\mathcal{B}' = \{X_1, X_2, \ldots, X_j\}$. The algorithm either finds an entanglement witness in $\text{span}(\mathcal{B}')$ for $\rho$, or it concludes that no such witness exists.

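For any $Y \in \mathbb{H}_{M,N}$ with $Y = \sum_{i=0}^{M^2N^2-1} y_i X_i$, let $\bar{Y}$ be the $j$-dimensional vector of the real numbers $y_i$ for $i = 1, 2, \ldots, j$. Conversely, for any $\bar{Y} \in \mathbb{R}^j$ with elements $y_1, \ldots, y_j$, let $Y = \sum_{i=1}^{j} y_i X_i$. Define the hyperplanes $\pi_{\bar{Y},b} = \{x \in \mathbb{R}^j : \bar{Y}^T x = b\}$ and halfspaces $H_{\bar{Y},b} = \{x \in \mathbb{R}^j : \bar{Y}^T x \le b\}$ similarly to before. Define

$$\bar{S}_{M,N} = \{\bar{\sigma} : \sigma \in S_{M,N}\}. \tag{5.40}$$

Note $\bar{S}_{M,N}$ is a convex set in $\mathbb{R}^j$ containing the origin. Define

$$\bar{\mathcal{W}} = \{\bar{A} \in \mathbb{R}^j : A \in \mathcal{W}\} \tag{5.41}$$

$$\bar{\mathcal{W}}_\rho = \{\bar{A} \in \bar{\mathcal{W}} : A \in \mathcal{W}_\rho\}. \tag{5.42}$$

Figure 5.4 shows a schematic of the sets $S_{M,N}$, $\bar{S}_{M,N}$, $\bar{\mathcal{W}}$, and $\bar{\mathcal{W}}_\rho$.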

The algorithm solves the following problem: 
Entanglement Witness Problem. Given the expected values of elements of B' for p G 
T>m,n and a precision parameter 5 > 0, either assert 

"p G Sm,n "'■ there exists a separable state a 

such that | \p— a\ \ < 5; or return 
A G VV: an operator such that 

b*(A) < tr(Ap). 

Note when j = M 2 N 2 — 1, this problem solves the separability problem. 

To reconcile the elements of the algorithm with the physics, note the following main 
correspondences: 

n ~ j 
K ~ Sm,n 

K p ~W p {=W p ii j = M 2 N 2 -1). 

To use this algorithm for the separability problem, it remains to show that there exists an 
appropriate centre c and outer radius R of Sm n- The maximally mixed quantum state Imn = 
1/MN is properly contained in Sm,n 0, 13' ^ nus c o is the 0- vector in W corresponding to 
Im,n'- 

"o 






c = (tr(JQ/ M iV)) i=1 ,., 



(5.43) 



We can calculate an upper bound $R$ on the radius of the smallest $c_0$-centered hypersphere that contains $S_{M,N}$ by referring to Fig. 5.4. Because the space is Euclidean, we have $R = \sqrt{1 - 1/MN}$.
For the quantum separability problem, we can derive a slightly better lower bound $r$ than we could for the generic case in Section 5.3.3. This is easily done by making use of the radius $r_S = 1/\sqrt{MN(MN - 1)}$ of the largest separable hyperball centered at $I_{M,N}$ contained in $S_{M,N}$, which is derived in Q]. Using $r_S$, we no longer need to resort to using $R^* = R$; rather, we can use $R^* = \sqrt{R^2 - r_S^2}$, as is easily seen from Fig. 5.4. The result is

$$r = \sin\theta \, \frac{1 - \tan(\theta/2)}{1 + \tan(\theta/2)}, \tag{5.44}$$

where

$$\sin\theta = \frac{2\delta/5}{\sqrt{R^2 - r_S^2} + \delta/5}(2\cos\theta - 1). \tag{5.45}$$
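The radii involved are easy to tabulate. The sketch below computes $R = \sqrt{1 - 1/MN}$, $r_S = 1/\sqrt{MN(MN-1)}$ and $R^* = \sqrt{R^2 - r_S^2}$, and then compares the resulting $r$ with the generic one. It assumes our reading of the garbled formula for $\theta$, namely $\sin\theta = \frac{2\delta/5}{\sqrt{R^2 - r_S^2} + \delta/5}(2\cos\theta - 1)$, and uses bisection as an arbitrary root finder; names and the value of $\delta$ are illustrative.

```python
import math

def outer_radius(M, N):
    """R = sqrt(1 - 1/(M N))."""
    return math.sqrt(1.0 - 1.0 / (M * N))

def separable_ball_radius(M, N):
    """r_S = 1 / sqrt(M N (M N - 1))."""
    return 1.0 / math.sqrt(M * N * (M * N - 1))

def solve_r(delta, radius):
    """Solve sin(t) = c (2 cos(t) - 1) with c = (2 delta/5)/(radius + delta/5)
    by bisection on (0, pi/3), then return
    r = sin(t) (1 - tan(t/2)) / (1 + tan(t/2))."""
    c = (2.0 * delta / 5.0) / (radius + delta / 5.0)
    lo, hi = 0.0, math.pi / 3.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.sin(mid) - c * (2.0 * math.cos(mid) - 1.0) < 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return math.sin(t) * (1.0 - math.tan(t / 2.0)) / (1.0 + math.tan(t / 2.0))

# Two qubits, with a hypothetical precision parameter.
M = N = 2
R = outer_radius(M, N)
r_S = separable_ball_radius(M, N)
R_star = math.sqrt(R * R - r_S * r_S)
delta = 1e-3
r_generic = solve_r(delta, R)       # generic case: R* = R
r_quantum = solve_r(delta, R_star)  # quantum case: R* = sqrt(R^2 - r_S^2)
```

For $M = N = 2$ this gives $R^2 = 3/4$, $r_S^2 = 1/12$ and $(R^*)^2 = 2/3$; because $R^* < R$, the quantum bound `r_quantum` comes out slightly larger than `r_generic`.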

5.6 Closing remarks 

This chapter has given a new oracle-polynomial-time algorithm for WSEP($K$) relative to an oracle for WOPT($K$), for any convex $K \subset \mathbb{R}^n$ that properly contains a known point $c_0$ and is contained in a ball of finite known radius $R$. The novelty of the algorithm lies in the way in which the oracle is used to generate cutting planes for the feasibility problem for part of the polar of $K$. This new cut-generation method is based on an intuitive search heuristic (see Section |4~4]).

The new algorithm is based heavily on the cutting-plane algorithm of Atkinson and Vaidya [91], which uses shallow cutting. I mentioned that it is an open problem whether there exists a polynomial-time, analytic centre algorithm for the convex feasibility problem that does not require shallow cutting. Actually, it has been suggested by John E. Mitchell that central cutting can be used in the Atkinson-Vaidya algorithm, if certain techniques from [98] are employed to compute the new analytic centre. However, from correspondence with Mitchell and Yinyu Ye, it is unclear whether modifying the Atkinson-Vaidya algorithm in this way retains the polynomial-time convergence: while it is clear that a new analytic centre can be efficiently computed when central cuts are used, it is not clear that all the other delicate machinery in the convergence argument emerges unscathed.

Problem 10. Can the Atkinson-Vaidya algorithm be modified so as not to require shallow cutting, while still being polynomial-time?

Combined with the material in Chapter EJ I have shown that the new algorithm has a real 
potential with regard to practical application to solving the quantum separability problem, 
both in theoretical and experimental contexts - the general strategy of reducing the problem to 
the optimisation problem is the best known (see Section l4~£J) . In the case where M = N = 2, 
Ben Travaglione and I developed a working implementation of the algorithm, using Hansen's 
interval-analysis global optimisation routine 0, 93]. This implementation runs fast enough
that it could be used in practice. However, for higher dimensions (which are of greater interest 
and where precision and round-off considerations become extremely important), we were not 
satisfied that the implementation is optimal. Currently, Donny Cheung and I are working 
on a more robust implementation, which will hopefully run reasonably quickly in the case 
M = N = 3; that is, the run time of the program on a separable state is hopefully on the 
order of hours or days, as opposed to years. 
Figure 5.4: Schematic diagram (not to scale) of density operators and W in R n , n = M 2 N 2 .
The {X\) — {Xj} plane is a two-dimensional representation of the space span({Xi, X2, ...,Xj}). The 
large dashed circle represents the origin-centered unit hypersphere in M n . The upper shaded ellipse 
represents the (n — l)-dimensional hyperball of radius R centered at the maximally mixed state 
Imn which is the intersection of the hyperplane ttj i and the origin-centered unit hyperball in M n . 
The density operators are the heavy-outlined region in this (n — l)-dimensional hyperball; the inner 
heavy outlined shape represents the separable states Sm,n- The boundary of the maximal separable 
hyperball of radius r$ centered at Imn is shown as a dashed ellipse. An entangled state p is shown, 
and its projection ~p is also shown. The shaded elliptical disk (in heavy outline) in the {X\) — (Xj) 
plane is W (i.e. is a representation of the origin-centered (n — l)-dimensional unit hyperball); the 
darker shaded wedge is W p . The boundary of Sm,n is shown as a dashed line. 



Appendix A 



Convergence proofs 



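Lemma 25 (Lemma 17 in [91]). Let $z$ be an approximation to $\omega$ such that $\omega \in E(\nabla^2 F(z), z, \zeta)$. 
Suppose the hyperplane $(a, \beta)$ is added in Case 2 with

\[ \gamma_0^2 = \gamma^2 = \frac{a^T(\nabla^2 F(z))^{-1}a}{(a^Tz-\beta)^2}. \]

Then,

(a) $\dfrac{|a^T(z-\omega)|}{a^Tz-\beta} \le \zeta\gamma$;

(b) $\dfrac{|a^T(z-\omega)|}{a^T\omega-\beta} \le \dfrac{\zeta\gamma}{1-\zeta\gamma}$;

(c) $\sigma_a(\omega) \le \bar\gamma^2 := \gamma^2\left(\dfrac{1}{1-\zeta\gamma}\right)^2\left(\dfrac{1}{1-\zeta}\right)^2$.

Proof. (a) Since $\omega \in E(\nabla^2 F(z), z, \zeta)$, we know that

\[ |a^T(z-\omega)| \le \zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}. \tag{A.1} \]

Therefore,

\[ \frac{|a^T(z-\omega)|}{a^Tz-\beta} \le \frac{\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}}{a^Tz-\beta} = \zeta\gamma. \tag{A.2} \]

(b) By (A.1), we get

\[ \frac{|a^T(z-\omega)|}{a^T\omega-\beta} = \frac{|a^T(z-\omega)|}{a^Tz-\beta+a^T(\omega-z)} \le \frac{\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}}{a^Tz-\beta-\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}} = \frac{\zeta\gamma}{1-\zeta\gamma}. \tag{A.3} \]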

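(c) We have

\[
\begin{aligned}
\gamma^2 &= \frac{a^T(\nabla^2F(z))^{-1}a}{(a^Tz-\beta)^2} \\
&\ge (1-\zeta)^2\,\frac{a^T(\nabla^2F(\omega))^{-1}a}{(a^Tz-\beta)^2} \qquad \text{[Lemma 18]} \\
&= (1-\zeta)^2\,\frac{a^T(\nabla^2F(\omega))^{-1}a/(a^T\omega-\beta)^2}{\left(1+a^T(z-\omega)/(a^T\omega-\beta)\right)^2} \\
&\ge (1-\zeta)^2\,\frac{a^T(\nabla^2F(\omega))^{-1}a/(a^T\omega-\beta)^2}{\left(1+\zeta\gamma/(1-\zeta\gamma)\right)^2} \qquad \text{[from part (b)]} \\
&= \frac{(1-\zeta)^2}{\left(1+\zeta\gamma/(1-\zeta\gamma)\right)^2}\,\nabla F_a(\omega)^T(\nabla^2F(\omega))^{-1}\nabla F_a(\omega) \\
&\ge \frac{(1-\zeta)^2}{\left(1+\zeta\gamma/(1-\zeta\gamma)\right)^2}\,\nabla F_a(\omega)^T(\nabla^2F_a(\omega))^{-1}\nabla F_a(\omega),
\end{aligned}
\]

where the second-last line follows from $\nabla F_a(\omega) = \nabla F(\omega) - a/(a^T\omega-\beta) = -a/(a^T\omega-\beta)$, 
and the last line follows by noting that $\nabla^2F_a(x) = \nabla^2F(x) + aa^T/(a^Tx-\beta)^2$ and applying 
Lemma 19. Thus,

\[ \gamma^2 \ge \frac{(1-\zeta)^2}{\left(1+\zeta\gamma/(1-\zeta\gamma)\right)^2}\,\sigma_a(\omega), \tag{A.4} \]

and, since $1+\zeta\gamma/(1-\zeta\gamma) = 1/(1-\zeta\gamma)$,

\[ \sigma_a(\omega) \le \frac{\gamma^2}{(1-\zeta\gamma)^2(1-\zeta)^2} =: \bar\gamma^2. \]

□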

Lemma 26 (Lemma 18 in [91]). Suppose a hyperplane is added in Case 2, and the analytic 
center moves from uj to uj & . Let 7 = a/ a T ( V 2 F(z))~ 1 a/(a T z — f3) 2 . If '7 < |, then 

a T {V 2 F{z))^a l-( 



(a^ a -/3) 2 " ' Vl + 7g7/(l"0 + C7 
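Proof. We have

\[
\begin{aligned}
a^Tz_a-\beta &= a^Tz-\beta+a^T(z_a-\omega_a+\omega_a-\omega+\omega-z) \\
&\le a^Tz-\beta+|a^T(z_a-\omega_a)|+|a^T(\omega_a-\omega)|+|a^T(\omega-z)| \\
&\le a^Tz-\beta+\zeta\sqrt{a^T(\nabla^2F_a(z_a))^{-1}a}+q_{\bar\gamma}\sqrt{a^T(\nabla^2F_a(\omega))^{-1}a}+\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a} \\
&\qquad \text{[first and third terms by (5.17) and Lemma 16; second term by (5.19) and Lemma 16]} \\
&\le a^Tz-\beta+\zeta\sqrt{a^T(\nabla^2F_a(z_a))^{-1}a}+q_{\bar\gamma}\sqrt{a^T(\nabla^2F(\omega))^{-1}a}+\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a} \\
&\qquad \text{[second term by Lemma 19]} \\
&\le a^Tz-\beta+\zeta(a^Tz_a-\beta)+q_{\bar\gamma}\sqrt{a^T(\nabla^2F(\omega))^{-1}a}+\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a} \\
&\qquad \text{[second term by Lemma 17].}
\end{aligned}
\]

And so,

\[
\begin{aligned}
\frac{a^Tz_a-\beta}{a^Tz-\beta} &\le \frac{1}{1-\zeta}\left(1+\frac{q_{\bar\gamma}\sqrt{a^T(\nabla^2F(\omega))^{-1}a}}{a^Tz-\beta}+\frac{\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}}{a^Tz-\beta}\right) \\
&\le \frac{1}{1-\zeta}\left(1+\frac{q_{\bar\gamma}}{1-\zeta}\,\frac{\sqrt{a^T(\nabla^2F(z))^{-1}a}}{a^Tz-\beta}+\frac{\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}}{a^Tz-\beta}\right) \qquad \text{[by Lemmas 18 and 19]} \\
&= \frac{1}{1-\zeta}\left(1+\frac{\gamma q_{\bar\gamma}}{1-\zeta}+\zeta\gamma\right).
\end{aligned}
\tag{A.5}
\]

It now follows that

\[
\frac{a^T(\nabla^2F(z))^{-1}a}{(a^Tz_a-\beta)^2} = \frac{a^T(\nabla^2F(z))^{-1}a}{(a^Tz-\beta)^2}\cdot\frac{(a^Tz-\beta)^2}{(a^Tz_a-\beta)^2} \ge \gamma^2\left(\frac{1-\zeta}{1+\gamma q_{\bar\gamma}/(1-\zeta)+\zeta\gamma}\right)^2,
\tag{A.6}
\]

as stated in the lemma. □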



Theorem 27 (Approximation version of Theorem 13 in [91]). Suppose that $\max_{1\le i\le h}\mu_i(z) < 2$ 
at the beginning of an iteration, i.e. Case 2 is about to occur. If the current search space $P$ 
is determined by $h$ hyperplanes (in addition to the unit hypersphere), then

\[ \det(\nabla^2F(z)) \ge 2^{-n}(1+C_2)^h = 2^{[\log_2(1+C_2)]h-n}, \tag{A.7} \]

for some positive constant $C_2$ which depends on the parameters $\sigma_0$ and $\gamma_0$ of the algorithm and 
the "minimal goodness" $\zeta_0$ of the approximation to the analytic centers. This can be improved 
to

\[ \det(\nabla^2F(z)) \ge 2^{-n}(2.5)(1+C_2)^{h-1}. \tag{A.8} \]



Proof. Let $\{(a_1,b_1),\ldots,(a_h,b_h)\}$ be the set of hyperplanes describing $P$. For each $i$, let $s_i$ 
be the number of the most recent iteration in which $\kappa(a_i,b_i)$ was changed. Without loss of 
generality, we assume that $s_1 < s_2 < \ldots < s_h$. Let $F_0(x)$ be our self-concordant barrier 
function over the hyperball alone:

\[ F_0(x) := -\ln(1-x^Tx). \tag{A.9} \]

Construct a set of auxiliary matrices as follows:

\[ M_0 := 2I, \qquad M_i := M_{i-1}+\frac{a_ia_i^T}{(\kappa(a_i,b_i))^2}, \quad i = 1,\ldots,h. \tag{A.10} \]

The matrix $M_0$ is $\nabla^2F_0(0)$. The $M_i$'s add in the terms corresponding to the hyperplanes $(a_i,b_i)$ with the 
current settings of $\kappa(a_i,b_i)$. 

Let $z(s_k)$ represent the approximate analytic center at the beginning of iteration $s_k$. At the 
beginning of iteration $s_k$, the $\kappa$ values corresponding to constraints in the set $\{(a_1,b_1),\ldots,(a_{k-1},b_{k-1})\}$ 
have already experienced their final change up to the time of the statement of the theorem. 
Because $\kappa(a_k,b_k)$ changes in the iteration $s_k$, iteration $s_k$ must be an occurrence of either 
Subcase 1.2 or Case 2. If it is an occurrence of Case 2, then we easily see that

\[ a_i^Tz(s_k)-b_i < 2\kappa(a_i,b_i), \quad i = 1,\ldots,k-1, \tag{A.11} \]

for otherwise Case 2 would not occur at all. Inequality (A.11) also holds true, however, if 
iteration $s_k$ is an occurrence of Subcase 1.2, for the following reason: Suppose that, for some 
$\ell$ in $\{1,\ldots,k-1\}$, $a_\ell^Tz(s_k)-b_\ell \ge 2\kappa(a_\ell,b_\ell)$. Notice that Subcase 1.2 does not affect the 
approximate analytic center at all, and no plane is in line to be discarded, else Subcase 1.1 
would occur instead of Subcase 1.2. As a result, instances of Subcase 1.2 continue to occur 
until $\kappa(a_\ell,b_\ell)$ becomes reset. This contradicts the assumption that, at iteration $s_k$, the $\ell$th 
$\kappa$-value has experienced its last change. So, regardless of whether iteration $s_k$ is an instance 
of Subcase 1.2 or Case 2, inequality (A.11) holds, and, therefore,

\[ \frac{a_ia_i^T}{(a_i^Tz(s_k)-b_i)^2} \ge \frac{a_ia_i^T}{4(\kappa(a_i,b_i))^2}, \quad i = 1,\ldots,k-1. \tag{A.12} \]
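Since the Hessian of the barrier function $F$ takes the form

\[
\nabla^2F(z(s_k)) = \frac{4z(s_k)z^T(s_k)}{(1-z^T(s_k)z(s_k))^2}+\frac{2I}{1-z^T(s_k)z(s_k)}+\sum_{i=1}^{k-1}\frac{a_ia_i^T}{(a_i^Tz(s_k)-b_i)^2}
+\text{(additional positive semi-definite terms)}, \tag{A.13}
\]

it follows that at the beginning of iteration $s_k$ we have

\[
\begin{aligned}
\xi^T\nabla^2F(z(s_k))\xi &\ge 2\xi^T\xi+\sum_{i=1}^{k-1}\frac{(a_i^T\xi)^2}{(a_i^Tz(s_k)-b_i)^2} \\
&\ge \frac{1}{4}\left(2\xi^T\xi+\sum_{i=1}^{k-1}\frac{(a_i^T\xi)^2}{(\kappa(a_i,b_i))^2}\right) \\
&= \frac{1}{4}\,\xi^TM_{k-1}\xi, \quad \text{for every } \xi\in\mathbb{R}^n.
\end{aligned}
\tag{A.14}
\]

(The two terms corresponding to the hypersphere are minimised when $z = 0$, which is the 
setting of $z$ in our definition of $M_0$.) Thus, by Lemma 19,

\[ \xi^T(\nabla^2F(z(s_k)))^{-1}\xi \le 4\,\xi^TM_{k-1}^{-1}\xi, \quad \text{for every } \xi\in\mathbb{R}^n. \]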
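
As a numerical illustration of (A.13) and (A.14), the following sketch builds the Hessian of a barrier of the form F(x) = -ln(1 - x^T x) - sum_i ln(a_i^T x - b_i) and compares it with one quarter of the auxiliary matrix. The random test data, the helper names, and the choice kappa_i = (a_i^T z - b_i)/1.5 (which makes every slack smaller than 2 kappa_i) are assumptions made purely for the example; they are not settings taken from the algorithm.

```python
import numpy as np


def barrier_hessian(z, A, b):
    """Hessian at z of F(x) = -ln(1 - x.x) - sum_i ln(a_i.x - b_i)."""
    n = z.size
    s = 1.0 - z @ z                      # slack of the unit-ball constraint
    H = 4.0 * np.outer(z, z) / s**2 + 2.0 * np.eye(n) / s
    for a_i, t in zip(A, A @ z - b):     # t = a_i.z - b_i > 0 inside P
        H += np.outer(a_i, a_i) / t**2
    return H


def auxiliary_matrix(A, kappa):
    """M = 2I + sum_i a_i a_i^T / kappa_i^2, following the recursion (A.10)."""
    M = 2.0 * np.eye(A.shape[1])
    for a_i, k in zip(A, kappa):
        M += np.outer(a_i, a_i) / k**2
    return M


rng = np.random.default_rng(0)
n, h = 4, 6
A = rng.normal(size=(h, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)    # unit normals (illustrative)
b = -0.5 * np.ones(h)                            # origin is strictly feasible
z = rng.normal(size=n)
z *= 0.2 / np.linalg.norm(z)                     # a point well inside P

slack = A @ z - b
kappa = slack / 1.5                              # hypothetical: slack < 2 * kappa

H = barrier_hessian(z, A, b)
M = auxiliary_matrix(A, kappa)

# Smallest eigenvalue of H - M/4; it is non-negative when (A.14) holds.
gap_hessian = np.linalg.eigvalsh(H - M / 4.0).min()
# Smallest eigenvalue of 4 M^{-1} - H^{-1}; non-negative for the inverse form.
gap_inverse = np.linalg.eigvalsh(4.0 * np.linalg.inv(M) - np.linalg.inv(H)).min()
print(gap_hessian, gap_inverse)
```

Both printed gaps should be non-negative for any data satisfying the slack condition; the check says nothing about how the kappa values arise inside the algorithm.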

Iteration $s_k$ is the last time $\kappa(a_k,b_k)$ was changed, and this change occurred in Subcase 
1.2 or Case 2. If the change occurred in Subcase 1.2, then, at the time of the change,

\[
\begin{aligned}
\sigma_0 \le \sigma_k(z(s_k)) &= \frac{a_k^T(\nabla^2F(z(s_k)))^{-1}a_k}{(a_k^Tz(s_k)-b_k)^2} \qquad \text{[by definition of } \sigma_k\text{]} \\
&\le \frac{4\,a_k^TM_{k-1}^{-1}a_k}{(a_k^Tz(s_k)-b_k)^2} = \frac{4\,a_k^TM_{k-1}^{-1}a_k}{(\kappa(a_k,b_k))^2}
\end{aligned}
\]

($\kappa(a_k,b_k)$ is the newly reset value) and we conclude that

\[ \frac{a_k^TM_{k-1}^{-1}a_k}{(\kappa(a_k,b_k))^2} \ge \frac{\sigma_0}{4}. \tag{A.15} \]

If the change occurred in Case 2, then the argument is harder. We employ the notation 
(a, (3) to refer to the hyperplane added in Case 2, just as we did in the algorithm itself. By 
Lemma I2E1 if 7 < §, then 

« T (VF«. t )))->„ 2 / i-c y 



a T z,-/3¥ " Vl + 7?,/(l-C)+C7 
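Thus, since we set $\beta$ to make $\gamma^2 = \gamma_0^2$,

\[ \frac{a^T(\nabla^2F(z(s_k)))^{-1}a}{(a^Tz_a-\beta)^2} \ge C_1 := \gamma_0^2\left(\frac{1-\zeta_0}{1+\gamma_0q_{\bar\gamma_0}/(1-\zeta_0)+\zeta_0\gamma_0}\right)^2, \tag{A.17} \]

where $\bar\gamma_0^2 := \gamma_0^2\,(1/(1-\zeta_0\gamma_0))^2\,(1/(1-\zeta_0))^2$; consequently,

\[ C_1 \le \frac{a^T(\nabla^2F(z(s_k)))^{-1}a}{(a^Tz_a-\beta)^2} \le \frac{4\,a^T(M_{k-1})^{-1}a}{(a^Tz_a-\beta)^2} = \frac{4\,a^T(M_{k-1})^{-1}a}{(\kappa(a,\beta))^2}, \tag{A.18} \]

and it follows that

\[ \frac{a^T(M_{k-1})^{-1}a}{(\kappa(a,\beta))^2} \ge \frac{C_1}{4}. \tag{A.19} \]

Regardless of whether the change of $\kappa(a_k,b_k)$ occurs in Subcase 1.2 or in Case 2, from 
(A.15) and (A.19) we can assert that

\[ \frac{a_k^T(M_{k-1})^{-1}a_k}{(\kappa(a_k,b_k))^2} \ge C_2 := \frac{1}{4}\min\{\sigma_0, C_1\}, \quad \text{for } 1\le k\le h. \tag{A.20} \]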

In fact, since we add the first cutting plane $(a_1, 0)$ "manually", we know that, for $k = 1$,

\[ \frac{a_1^T(M_0)^{-1}a_1}{(\kappa(a_1,b_1))^2} = \frac{1/2}{1/3} = 3/2 = 1.5, \tag{A.21} \]

where 1.5 may be larger than the largest $C_2$ we can achieve. 
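
One way to read the constants 1/2 and 1/3 in (A.21) is sketched below. It assumes (this is not spelled out in the text above) that the first cutting plane passes through the origin with a unit normal a_1 and that kappa(a_1, b_1) is the distance of the exact analytic center of the resulting half-ball from that plane. By symmetry the center lies on the ray t a_1, so it minimises f(t) = -ln(t) - ln(1 - t^2); the helper name `half_ball_center` is hypothetical.

```python
import math


def half_ball_center(tol=1e-14):
    """Minimise f(t) = -ln(t) - ln(1 - t^2) on (0, 1) by bisection on f'(t)."""
    def df(t):
        return -1.0 / t + 2.0 * t / (1.0 - t * t)

    lo, hi = 1e-9, 1.0 - 1e-9            # f' < 0 at lo and f' > 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if df(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t_star = half_ball_center()              # distance of the center from the plane
kappa_sq = t_star**2                     # assumed value of kappa(a_1, b_1)^2
ratio = 0.5 / kappa_sq                   # a_1^T (2I)^{-1} a_1 = 1/2 for unit a_1
print(t_star, kappa_sq, ratio)
```

Under these assumptions the minimiser is t = 1/sqrt(3), so kappa^2 = 1/3 and the ratio equals 3/2.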

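Since each $M_i$, $i = 1,\ldots,h$, is symmetric positive definite, we have, for $i \ge 1$,

\[
\begin{aligned}
\det(M_i) &= \det\!\left(M_{i-1}+\frac{a_ia_i^T}{(\kappa(a_i,b_i))^2}\right) \\
&= \det(M_{i-1})\,\det\!\left(I+\frac{M_{i-1}^{-1/2}a_ia_i^TM_{i-1}^{-1/2}}{(\kappa(a_i,b_i))^2}\right).
\end{aligned}
\]

For an arbitrary vector $v\in\mathbb{R}^n$, the operator $(I\pm vv^T)$ has set of eigenvalues $\{1,1,\ldots,1\pm v^Tv\}$. 
Thus, for $i \ge 1$,

\[
\begin{aligned}
\det(M_i) &= \det(M_{i-1})\left(1+\frac{a_i^TM_{i-1}^{-1}a_i}{(\kappa(a_i,b_i))^2}\right) \\
&\ge \det(M_{i-1})(1+C_2).
\end{aligned}
\]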
Therefore, we have

\[ \det(M_h) \ge (\det M_0)(1+C_2)^h \tag{A.22} \]
\[ \phantom{\det(M_h)} = 2^n(1+C_2)^h. \tag{A.23} \]

The hypotheses of the theorem state that Case 2 is about to occur. This means that (A.11) 
and (A.12) hold at the current $z$ and for $h \ge 1$. Likewise, (A.14) is true with the current $z$ 
in place of $z(s_k)$ and with $h$ in place of $k-1$, i.e., we have $\xi^T\nabla^2F(z)\xi \ge \frac{1}{4}\xi^TM_h\xi$ for all $\xi$. 
Thus,

\[ \det(\nabla^2F(z)) \ge \frac{1}{4^n}\det(M_h), \]

and so

\[ \det(\nabla^2F(z)) \ge \frac{1}{4^n}\cdot 2^n\cdot(1+C_2)^h = 2^{-n}(1+C_2)^h, \]

which proves the theorem. Alternatively, taking potential advantage of (A.21),

\[ \det(\nabla^2F(z)) \ge 2^{-n}(2.5)(1+C_2)^{h-1}. \tag{A.24} \]

□
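
The determinant step in the proof above rests on the rank-one update identity det(M + a a^T / kappa^2) = det(M) (1 + a^T M^{-1} a / kappa^2). The sketch below checks that identity and the resulting product bound numerically. The unit normals, the kappa values, and the function name `determinant_growth` are illustrative assumptions, not data produced by the algorithm.

```python
import numpy as np


def determinant_growth(n=5, h=8, seed=1):
    """Apply h rank-one updates to M_0 = 2I and record determinants and gains."""
    rng = np.random.default_rng(seed)
    M = 2.0 * np.eye(n)                       # M_0 = 2I
    dets = [np.linalg.det(M)]
    gains = []
    for _ in range(h):
        a = rng.normal(size=n)
        a /= np.linalg.norm(a)                # unit normal (illustrative)
        kappa = rng.uniform(0.3, 1.0)         # illustrative kappa value
        gains.append(a @ np.linalg.solve(M, a) / kappa**2)
        M = M + np.outer(a, a) / kappa**2     # rank-one update
        dets.append(np.linalg.det(M))
    return np.array(dets), np.array(gains)


dets, gains = determinant_growth()
# Rank-one identity: det(M_i) = det(M_{i-1}) * (1 + gain_i).
predicted = dets[:-1] * (1.0 + gains)
# With C = min_i gain_i, the product bound det(M_h) >= 2^n (1 + C)^h follows.
lower_bound = dets[0] * (1.0 + gains.min()) ** len(gains)
print(np.allclose(predicted, dets[1:]), dets[-1] >= lower_bound)
```

Here the smallest observed gain plays the role that the constant C_2 plays in (A.20).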

Lemma 28 (Lemma 19 in [91]). 

For the approximate analytic center $z$ with $\omega \in E(\nabla^2F(z), z, \zeta)$, 



we have 



P C E(X7 2 F(z),z,$), 



where 
Proof. For every $x \in P$,

\[
\begin{aligned}
(x-z)^T\nabla^2F(z)(x-z) &= (x-\omega+\omega-z)^T\nabla^2F(z)(x-\omega+\omega-z) \\
&= (x-\omega)^T\nabla^2F(z)(x-\omega)+(\omega-z)^T\nabla^2F(z)(\omega-z)+2(x-\omega)^T\nabla^2F(z)(\omega-z) \\
&\le 2(x-\omega)^T\nabla^2F(z)(x-\omega)+2(\omega-z)^T\nabla^2F(z)(\omega-z) \qquad [\text{since } 2a\cdot b \le a\cdot a+b\cdot b] \\
&\le 2(x-\omega)^T\frac{\nabla^2F(\omega)}{(1-\zeta)^2}(x-\omega)+2\zeta^2 \qquad [\text{by Lemma 18 and hypothesis}].
\end{aligned}
\]

The lemma follows by substituting the conclusion of Lemma 23 for $(x-\omega)^T\nabla^2F(\omega)(x-\omega)$. □



Theorem 29 (Approximation version of Theorem 14 in [91]). There exists a constant 
$\nu$, independent of $h$, $n$, $R$, and $\delta$, and there exists a function $u(n,\delta) \in O(\mathrm{poly}(n, \log(1/\delta)))$ 
such that if $h = \nu nu(n,\delta)$, then the volume of $K_\rho$ is sufficiently small so as to assert that 
$\rho \in S(K,\delta)$. 



Proof. The volume of an ellipsoid $E(A,z,r)$ is upper-bounded by $r^n2^n/\sqrt{\det A}$. Thus, 
from Lemma 28,

\[ \mathrm{volume}(P) \le \mathrm{volume}(E(\nabla^2F(z),z,6h)) \le \frac{(12h)^n}{\sqrt{\det\nabla^2F(z)}}. \]

To prove the theorem, the bound in (5.7) implies that it suffices to show that there exist $\nu$ 
and $u$ such that $h = \nu nu$ implies

\[ \frac{(12\nu nu)^n}{\sqrt{\det\nabla^2F(z)}} < \left(\frac{r}{n}\right)^n. \tag{A.25} \]

Theorem 27 strengthens this further to

\[ \frac{(12\nu nu)^n}{2^{[\log_2(1+C_2)]\nu nu/2-n/2}} < \left(\frac{r}{n}\right)^n \tag{A.26} \]

\[ \frac{12\nu nu}{2^{[\log_2(1+C_2)]\nu u/2-1/2}} < \frac{r}{n} \tag{A.27} \]

\[ \log_2(12\nu nu)-\left([\log_2(1+C_2)]\nu u/2-1/2\right) < \log_2(r/n) \]

\[ \log_2(n/r)+\log_2(n)+\log_2(u)+\log_2(12)+1/2 < \nu u\log_2(1+C_2)/2-\log_2(\nu) \]

\[ \frac{\log_2(n/r)}{u}+\frac{\log_2(n)}{u}+\frac{\log_2(u)}{u}+\frac{\log_2(12)+1/2}{u} < \frac{1}{2}\nu\log_2(1+C_2)-\frac{\log_2(\nu)}{u}. \tag{A.28} \]

Setting

\[ u := \log_2(n/r)+\log_2(n) = 2\log_2(n)+\log_2(1/r) \tag{A.29} \]

and assuming $n \ge 2$ gives $u \ge 2$ and hence an upper bound on the left side of (A.28) of 
$1+1+1+(\log_2(12)+1/2)/2 < 5.0425$. Thus it suffices to find $\nu$ such that

\[ 5.0425 < \frac{1}{2}\left(\nu\log_2(1+C_2)-\log_2(\nu)\right). \tag{A.30} \]

Since $C_2$ is a constant, it suffices that $\nu$ be constant. Later we will see that $r$ is roughly $\delta/R$, 
thus the theorem is proven. The higher the value of $C_2$, the smaller the value $\nu$ that we need. 
Note that the constant $\nu$ may be improved (lowered) with knowledge of $r$ and for specific 
(larger) values of $n$. □
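
To get a feel for the size of the constant in (A.30), the sketch below scans integers for the smallest value satisfying the inequality, given a value of C_2. It uses the exact value of 1 + 1 + 1 + (log_2(12) + 1/2)/2 rather than a rounded constant. Restricting the search to integers, the cap `nu_max`, and the sample value C_2 = 1.5 are assumptions made for illustration only; the achievable C_2 of the algorithm may be much smaller.

```python
import math


def left_side_bound():
    """Value of 1 + 1 + 1 + (log2(12) + 1/2)/2 bounding the left side of (A.28)."""
    return 3.0 + (math.log2(12.0) + 0.5) / 2.0


def smallest_nu(c2, nu_max=10**7):
    """Smallest integer nu >= 1 with (nu*log2(1+c2) - log2(nu))/2 > bound."""
    bound = left_side_bound()
    growth = math.log2(1.0 + c2)
    for nu in range(1, nu_max + 1):
        if 0.5 * (nu * growth - math.log2(nu)) > bound:
            return nu
    raise ValueError("no suitable nu found below nu_max")


print(left_side_bound())
print(smallest_nu(1.5))
```

Smaller values of C_2 make the scan return larger constants, in line with the remark that a higher C_2 permits a smaller constant.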



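Lemma 30 (Lemma 16 in [91]). Let $\zeta < 1$. If $\omega \in E(\nabla^2F(z), z, \zeta)$, then for all $i$, 
$1 \le i \le h$,

\[ \sigma_i(\omega) \le \frac{\sigma_i(z)}{(1-\zeta)^4}. \tag{A.31} \]

Proof. From Lemmas 18 and 19 we know that for all $\xi\in\mathbb{R}^n$,

\[ \frac{1}{(1-\zeta)^2}\,\xi^T\nabla^2F(z)\xi \ge \xi^T\nabla^2F(\omega)\xi \ge (1-\zeta)^2\,\xi^T\nabla^2F(z)\xi. \]

Therefore,

\[ \frac{1}{(1-\zeta)^2}\,\xi^T(\nabla^2F(z))^{-1}\xi \ge \xi^T(\nabla^2F(\omega))^{-1}\xi \ge (1-\zeta)^2\,\xi^T(\nabla^2F(z))^{-1}\xi, \]

and

\[
\begin{aligned}
\sigma_i(\omega) = \frac{a_i^T(\nabla^2F(\omega))^{-1}a_i}{(a_i^T\omega-b_i)^2} &\le \frac{1}{(1-\zeta)^2}\,\frac{a_i^T(\nabla^2F(z))^{-1}a_i}{(a_i^T\omega-b_i)^2} \\
&= \frac{1}{(1-\zeta)^2}\,\frac{a_i^T(\nabla^2F(z))^{-1}a_i/(a_i^Tz-b_i)^2}{\left[(a_i^T\omega-b_i)/(a_i^Tz-b_i)\right]^2} \\
&= \frac{1}{(1-\zeta)^2}\,\frac{a_i^T(\nabla^2F(z))^{-1}a_i/(a_i^Tz-b_i)^2}{\left[1+a_i^T(\omega-z)/(a_i^Tz-b_i)\right]^2} \\
&= \frac{1}{(1-\zeta)^2}\,\frac{\sigma_i(z)}{\left[1+a_i^T(\omega-z)/(a_i^Tz-b_i)\right]^2}.
\end{aligned}
\tag{A.32}
\]

We know from Lemma 16 that $\zeta\sqrt{a_i^T(\nabla^2F(z))^{-1}a_i} \ge |a_i^T(\omega-z)|$, and therefore

\[ \frac{|a_i^T(\omega-z)|}{a_i^Tz-b_i} \le \frac{\zeta\sqrt{a_i^T(\nabla^2F(z))^{-1}a_i}}{a_i^Tz-b_i} \le \zeta, \tag{A.33} \]

where the second inequality follows since $P \supseteq E(\nabla^2F(z), z, 1)$ (by Lemma 17) and hence 
$\sqrt{a_i^T(\nabla^2F(z))^{-1}a_i} \le a_i^Tz-b_i$. It follows that

\[ \sigma_i(\omega) \le \frac{1}{(1-\zeta)^2}\,\frac{\sigma_i(z)}{(1-\zeta)^2} = \frac{\sigma_i(z)}{(1-\zeta)^4}. \]

□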
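
The inequality (A.31) can be probed numerically. The sketch below assumes a barrier of the form F(x) = -ln(1 - x^T x) - sum_i ln(a_i^T x - b_i), computes its analytic center with a damped Newton iteration, perturbs the center to obtain an approximate center z, and compares sigma_i at the two points. The random hyperplanes, the perturbation size 0.05, and all function names are assumptions for illustration; the algorithm's own centering procedure may differ.

```python
import numpy as np


def grad_hess(x, A, b):
    """Gradient and Hessian of F(x) = -ln(1 - x.x) - sum_i ln(a_i.x - b_i)."""
    s = 1.0 - x @ x
    t = A @ x - b
    g = 2.0 * x / s - A.T @ (1.0 / t)
    H = 2.0 * np.eye(x.size) / s + 4.0 * np.outer(x, x) / s**2
    H += (A / t[:, None] ** 2).T @ A          # sum_i a_i a_i^T / t_i^2
    return g, H


def analytic_center(A, b, iters=200):
    """Damped Newton from the origin (assumed strictly feasible)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g, H = grad_hess(x, A, b)
        step = np.linalg.solve(H, g)
        lam = np.sqrt(max(g @ step, 0.0))     # Newton decrement
        if lam < 1e-13:
            break
        x = x - step / (1.0 + lam)            # damped step stays inside P
    return x


def sigma(x, A, b):
    """sigma_i(x) = a_i^T (hess F(x))^{-1} a_i / (a_i.x - b_i)^2 for every i."""
    _, H = grad_hess(x, A, b)
    t = A @ x - b
    return np.einsum("ij,ij->i", A, np.linalg.solve(H, A.T).T) / t**2


rng = np.random.default_rng(2)
n, h = 3, 5
A = rng.normal(size=(h, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)
b = -rng.uniform(0.2, 0.8, size=h)            # origin strictly feasible
omega = analytic_center(A, b)

_, H_omega = grad_hess(omega, A, b)
L = np.linalg.cholesky(H_omega)               # H_omega = L L^T
u = rng.normal(size=n)
u /= np.linalg.norm(u)
z = omega + 0.05 * np.linalg.solve(L.T, u)    # distance 0.05 in the omega-norm

_, H_z = grad_hess(z, A, b)
zeta = np.sqrt((omega - z) @ H_z @ (omega - z))
lhs = sigma(omega, A, b)
rhs = sigma(z, A, b) / (1.0 - zeta) ** 4
print(zeta, np.all(lhs <= rhs))
```

The perturbation is taken inside the unit ellipsoid of the Hessian at the center, so that z remains strictly feasible and zeta stays well below 1.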

Theorem 31 (Approximation version of Theorem 11 in [91]). There exists a positive 
constant $\theta$, independent of $h$, $n$, $R$, and $\delta$, such that after $\iota$ iterations of the algorithm, 
$N(\omega) \ge \theta\iota$. The constant $\theta$ will depend on the parameters of the algorithm. 

Proof. The proof is a case analysis following the different cases in the algorithm. 

Case 1, Subcase 1.1: Let F d (x), ^f d (x), and N d (x) denote the functions F(x), ^(x), and N(x) 
resulting after a hyperplane is dropped. We have immediately that 

\[ N_d(z) \ge N(z)+\ln 2. \tag{A.34} \]

This is because by discarding a hyperplane, we have eliminated a term from (5.23) that is 
known to be less than or equal to $-\ln 2$. In this subcase, however, we must consider the effect 
of moving to a new approximate analytic center $z_d$. It works against us that, after the drop, 
$N_d(\omega) \ge N_d(\omega_d)$. Our goal, however, is to show that the difference is small relative to the 
guaranteed $\ln 2$ increase. 
We know

\[ (\nabla F_d(\omega))^T = \underbrace{(\nabla F(\omega))^T}_{=0}+\frac{a_j^T}{a_j^T\omega-b_j} = \frac{a_j^T}{a_j^T\omega-b_j} \tag{A.35} \]

and

\[ \nabla^2F_d(\omega) = \nabla^2F(\omega)-\frac{a_ja_j^T}{(a_j^T\omega-b_j)^2}. \tag{A.36} \]

Since $\nabla^2F(\omega)$ is symmetric positive definite, it has a square root, and hence

\[ \nabla^2F_d(\omega) = (\nabla^2F(\omega))^{1/2}\left[I-\frac{(\nabla^2F(\omega))^{-1/2}a_ja_j^T(\nabla^2F(\omega))^{-1/2}}{(a_j^T\omega-b_j)^2}\right](\nabla^2F(\omega))^{1/2}. \tag{A.37} \]

For an arbitrary vector $v\in\mathbb{R}^n$, the operator $(I-vv^T)$ has set of eigenvalues $\{1,1,\ldots,1-v^Tv\}$. 
If $v^Tv < 1$ then $(I-vv^T) > 0$, and for all $\xi\in\mathbb{R}^n$,

\[ (1-v^Tv)\,\xi^T\xi \le \xi^T(I-vv^T)\xi \le \xi^T\xi. \tag{A.38} \]
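We can apply equation (A.38) to the inner matrix in (A.37) with

\[ v := (\nabla^2F(\omega))^{-1/2}a_j/(a_j^T\omega-b_j) \]

and with $\xi := (\nabla^2F(\omega))^{1/2}\chi$ to conclude that for every $\chi\in\mathbb{R}^n$,

\[ (1-\sigma_j(\omega))\,\chi^T\nabla^2F(\omega)\chi \le \chi^T\nabla^2F_d(\omega)\chi \le \chi^T\nabla^2F(\omega)\chi \]

and thus by Corollary 19, for every $\chi\in\mathbb{R}^n$,

\[ \frac{1}{1-\sigma_j(\omega)}\,\chi^T(\nabla^2F(\omega))^{-1}\chi \ge \chi^T(\nabla^2F_d(\omega))^{-1}\chi \ge \chi^T(\nabla^2F(\omega))^{-1}\chi. \]

The substitution $\chi := \nabla F_d(\omega)$ gives, by equation (A.35),

\[ \nabla F_d(\omega)^T(\nabla^2F_d(\omega))^{-1}\nabla F_d(\omega) \le \frac{\sigma_j(\omega)}{1-\sigma_j(\omega)}. \]

Since in this Subcase 1.1 we have $\sigma_j(z) < \sigma_0$, Lemma 30 gives

\[ \sigma_j(\omega) \le \frac{\sigma_j(z)}{(1-\zeta)^4} < \frac{\sigma_0}{(1-\zeta)^4}. \]

We then have

\[ \frac{\sigma_j(\omega)}{1-\sigma_j(\omega)} < \frac{\sigma_0/(1-\zeta)^4}{1-\sigma_0/(1-\zeta)^4}, \]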



or, 

X d (uj) := v^dH < C 3 :-- 
It thus follows from Lemma I2T1 that 



gb/(l ~ Co) 4 
l-a /(l -Co) 4 ' 



F d (u) - F d (u d ) < C 4 := 1 



2 ° 3 
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and consequently N d (u>) — N d (uj d ) < C 4 . Now, 

N A (uj d ) - N(u) = (N d (oo d ) - N d (u)) + (N d (oo) - N d (z)) 
+ (N d (z)-N(z)) + (N(z)-N(uj)) 

= (N d (cu d ) - N d (u)) + (n(u) - N(z) + In 

+(N d (z)-N(z)) + (N(z)-N(u)) 

= (N d (u d ) - N d (uj)) + (N d (z) - N(z)) + In 

>-C 4 + ln2 + ln(l - aI{u} ~ Z) 



aju) — h, 



ajz — b 



ajuj — b,i 
ajz — bi 



> -C 4 + ln2 + ln 1 



afz - bi 
\afz — bi\ 

> -C 4 + ln2 + ln(l-C) [see fTQ3l) ] 

> -C 4 + In 2 - C - 4( 2 [assuming ( < -] 

> -C 4 + 0.615 [assuming ( < — ]■ (A.44) 

16 

Thus, as long as C 4 < 0.615, the theorem is proven in Subcase 1.1. 
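
The numerical constant 0.615 in (A.44) can be reproduced directly. The sketch below evaluates both the exact term ln 2 + ln(1 - zeta) and the relaxed term ln 2 - zeta - 4 zeta^2 at zeta = 1/16; the function name is hypothetical and the code only checks arithmetic, not the surrounding argument.

```python
import math


def subcase_gain(zeta):
    """Return (ln 2 + ln(1 - zeta), ln 2 - zeta - 4*zeta^2)."""
    exact = math.log(2.0) + math.log(1.0 - zeta)
    relaxed = math.log(2.0) - zeta - 4.0 * zeta**2
    return exact, relaxed


exact, relaxed = subcase_gain(1.0 / 16.0)
print(exact, relaxed)
```

At zeta = 1/16 the relaxed value is just above 0.615, which is where the threshold on C_4 comes from.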

Case 1, Subcase 1.2: The only action in this subcase is a change in $\kappa(a_j,b_j)$ for some 
hyperplane. This change does not affect $F$ or the analytic center $\omega$. One of the $h$ terms in 
(5.23) was previously less than or equal to $-\ln 2$, and now it becomes 0. Let $N_{\mathrm{new}}$ be the new 
function $N$ with the newly reset $\kappa(a_j,b_j)$. We have $N_{\mathrm{new}}(z)-N(z) \ge \ln 2$. Noting that

\[
\begin{aligned}
N_{\mathrm{new}}(\omega)-N(\omega) &= (N_{\mathrm{new}}(\omega)-N_{\mathrm{new}}(z))+(N_{\mathrm{new}}(z)-N(z))+(N(z)-N(\omega)) \\
&= (F(\omega)-F(z))+(N_{\mathrm{new}}(z)-N(z))+(F(z)-F(\omega)) \\
&= N_{\mathrm{new}}(z)-N(z) \\
&\ge \ln 2,
\end{aligned}
\tag{A.45}
\]

proves the theorem in Subcase 1.2. 

Case 2: Let F a (x), ^^(x), and N^(x) be the functions F(x), ^(x), and N(x) resulting from 
the addition of a hyperplane in Case 2. For notational simplicity, define $H := (\nabla^2F(\omega))^{1/2}$, 






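$d := \|H(\omega_a-\omega)\|_2$. Define the function $x(t) := \omega+t(\omega_a-\omega)$. We have

\[
\begin{aligned}
\|H^{-1}\nabla F(\omega_a)\|_2 &= \|H^{-1}\nabla F(\omega_a)-H^{-1}\nabla F(\omega)\|_2 \\
&= \left\|\int_0^1H^{-1}\nabla^2F(x(t))(\omega_a-\omega)\,dt\right\|_2 \\
&= \left\|\int_0^1[H^{-1}\nabla^2F(x(t))H^{-1}]H(\omega_a-\omega)\,dt\right\|_2 \\
&= \left\|\int_0^1[I+A(t)]H(\omega_a-\omega)\,dt\right\|_2 \\
&\le \int_0^1\|[I+A(t)]H(\omega_a-\omega)\|_2\,dt \\
&\le \max_t\|I+A(t)\|_2\,\|H(\omega_a-\omega)\|_2 \\
&= \max_t\|I+A(t)\|_2\,d.
\end{aligned}
\tag{A.46}
\]

$I+A(t)$ is simply notation for the identity matrix plus some perturbation of the identity 
matrix; notice that $H^{-1}\nabla^2F(x(0))H^{-1} = I$. We need to examine the effects of the perturbation. 
By definition of $d$, $\omega_a \in E(\nabla^2F(\omega),\omega,d)$. 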

We will want to invoke Lemma shortly. Let us first dispense with the case d > 1/4. 
Join u a and uj by a line segment and let x' denote the point on that segment such that 
(x' — uj) t \7 2 F(uj){x' — uj) = (1/4) 2 . Using the Taylor approximation, convexity, and the fact 
that uj minimises F, we conclude 



F(uj a ) - F(uj) > F(x') - F(uj) 

= VF(uj) t (x' -uj) + -(x' - uj) t V 2 F(uj)(x' -uj) + Error 



1 H\ 2 „ 
+ - - + Error 



2 V4 

1 fl/4) 3 

> , ' r [by Lemma E 

~ 32 3(1 - 1/4) 

> 0.02, 



and so $N(\omega_a)-N(\omega) \ge 0.02$. However,

\[
\begin{aligned}
N_a(\omega_a)-N(\omega_a) &= -\ln\left(\frac{a^T\omega_a-\beta}{\kappa(a,\beta)}\right) = -\ln\left(\frac{a^T\omega_a-\beta}{a^Tz_a-\beta}\right) \\
&\ge -\ln\left(1+\frac{|a^T(\omega_a-z_a)|}{a^Tz_a-\beta}\right) \ge -\ln(1+\zeta) \qquad [\text{by (A.33)}] \\
&\ge -\zeta,
\end{aligned}
\]

and so

\[ N_a(\omega_a)-N(\omega) = N_a(\omega_a)-N(\omega_a)+N(\omega_a)-N(\omega) \ge 0.02-\zeta \tag{A.47} \]

for the case in which we assume $d \ge 1/4$. Recall $\zeta$, which measures the quality of the 
approximation of $z$ to $\omega$, is under our control. Up until this point, we have assumed $\zeta \le 
1/16 = 0.0625$; now, we must assume that $\zeta < 0.02$. 
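
The constant 0.02 used in the case d >= 1/4 can be checked in the same spirit. The sketch assumes, as the displayed derivation suggests, that the second-order Taylor error at relative distance r is bounded in magnitude by r^3 / (3(1 - r)); the function name is hypothetical.

```python
def taylor_gain(r):
    """r^2/2 - r^3/(3(1 - r)): quadratic term minus the assumed error bound."""
    return 0.5 * r**2 - r**3 / (3.0 * (1.0 - r))


gain = taylor_gain(0.25)          # equals 1/32 - 1/144 = 7/288
margin = gain - 0.02              # slack left after rounding down to 0.02
print(gain, margin)
```

The value at r = 1/4 is roughly 0.0243, so rounding down to 0.02 is safe, and the net increase 0.02 - zeta stays positive only when zeta is below 0.02.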

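Now consider the case $d < 1/4$. Since $\omega_a \in E(\nabla^2F(\omega),\omega,d)$ and $d < 1$, it follows by 
Lemma 18 that

\[
\begin{aligned}
\xi^T(I+A(t))\xi &= \xi^TH^{-1}\nabla^2F(x(t))H^{-1}\xi \\
&\le \frac{1}{(1-d)^2}\,\xi^TH^{-1}\nabla^2F(\omega)H^{-1}\xi \quad \text{for every } t\in[0,1] \\
&= \frac{1}{(1-d)^2}\,\xi^T\xi
\end{aligned}
\]

for every $\xi$ in $\mathbb{R}^n$. Therefore, $\max_t\|I+A(t)\|_2 \le 1/(1-d)^2$. Returning to (A.46), we may 
now conclude further that

\[ \|H^{-1}\nabla F(\omega_a)\|_2 \le d/(1-d)^2. \tag{A.48} \]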
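Consider a different perspective on $\|H^{-1}\nabla F(\omega_a)\|_2$. We know

\[ \nabla F(\omega_a) = \nabla F_a(\omega_a)+\frac{a}{a^T\omega_a-\beta} = \frac{a}{a^T\omega_a-\beta}, \tag{A.49} \]

where, recall, we select the appropriate $\beta$ when Case 2 occurs. Now,

\[
\begin{aligned}
a^T\omega_a-\beta &= a^Tz-\beta+a^T(\omega-z)+a^T(\omega_a-\omega) \\
&\le \underbrace{\frac{1}{\gamma}\sqrt{a^T(\nabla^2F(z))^{-1}a}}_{\text{by def'n of } \gamma}+\underbrace{\zeta\sqrt{a^T(\nabla^2F(z))^{-1}a}}_{\text{since } \omega\in E(\nabla^2F(z),z,\zeta)}+\underbrace{d\sqrt{a^T(\nabla^2F(\omega))^{-1}a}}_{\text{since } \omega_a\in E(\nabla^2F(\omega),\omega,d)} \\
&\le \frac{1}{\gamma}\,\frac{\sqrt{a^T(\nabla^2F(\omega))^{-1}a}}{1-\zeta}+\frac{\zeta\sqrt{a^T(\nabla^2F(\omega))^{-1}a}}{1-\zeta}+d\sqrt{a^T(\nabla^2F(\omega))^{-1}a},
\end{aligned}
\]

where the last inequality follows from two applications of Lemma 18. Using (A.49),

\[ \|H^{-1}\nabla F(\omega_a)\|_2 = \frac{\sqrt{a^T(\nabla^2F(\omega))^{-1}a}}{a^T\omega_a-\beta} \ge \frac{1}{1/(\gamma(1-\zeta))+\zeta/(1-\zeta)+d}. \tag{A.50} \]

Combining (A.48) and (A.50), we conclude that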



d -(l/7)+2 + C~ C5: ~(l/7o) + 2 + Co- (A - 51) 



Now invoke Lemma EH again to get 



1/^2 (C5) 3 



tf(a; a ) - N(u) = F(w a ) - F(w) > C 6 := -(C 5 )' - ^ _°' ■ (A.52) 

Thus, as long as $C_6 > 0$, the theorem is proven. □

Theorem 32 (Approximation version of Theorem 15 in [91]). If the algorithm 
does not first find a separating hyperplane or halt by Stopping Condition 1, then, within 
$O(nu\log(nuR/\delta))$ iterations, Stopping Condition 2 must be met. If Stopping Condition 2 is 
met, then the set $K_\rho$ is negligibly small and the algorithm may return "$\rho \in S(K,\delta)$". 

Proof. By Lemma 24, a stopping condition that says "the width of $P$ is too small to contain 
$K_\rho$" is

\[ 2r > \left[\min_i\{a_i^T\omega-b_i\}\right](3h+4). \tag{A.53} \]

Because we do not have our hands on $\omega$ during the algorithm, condition (A.53) is not feasible 
to check directly. Instead, we will use Theorem 31 and show that if $F(\omega)$ is larger than a 
certain value $T$, then Stopping Condition 2 is satisfied, which, in turn, implies that (A.53) is 
satisfied. 

First, we need an upper and lower bound on $a_i^Tz-b_i$ relative to $a_i^T\omega-b_i$. Suppose that 
$z$ is an approximation to $\omega$ with $\omega \in E(\nabla^2F(z),z,\zeta)$. For any index $i$, $1 \le i \le h$,

\[
\begin{aligned}
|\ln(a_i^T\omega-b_i)-\ln(a_i^Tz-b_i)| &= |\ln(1+(a_i^T(\omega-z))/(a_i^Tz-b_i))| \\
&= |\ln(1+t)|, \qquad [\text{where } |t| \le \zeta; \text{ see (A.33)}] \\
&\le -\ln(1-\zeta).
\end{aligned}
\]

This implies

\[ (1-\zeta)(a_i^T\omega-b_i) \le a_i^Tz-b_i \le (a_i^T\omega-b_i)/(1-\zeta). \tag{A.54} \]

Assuming C < Co? starting with (jA.53|) we have the series of implications, as promised 
above: 

2r > [mmi{ajuj - bi}](3h + 4) 
<= 2r> H^z^l ( 3 / t + 4) [by (jA~54)) ] 

<= 2r> [mi ^ c ;- b ' }1 (3/^ + 4) [by(H31] 

2r> ^(3/^ + 4) 



for some j 
for some j 
for some j 
-<= for some j 



Mi-Co) 2 ^ _T. , /, 

^ < - ln(ajcj - 6,) 

< — ln(aju; — bj) [since /i < z/nw] 



,2r(l-Co) 2 
3unu+A 



h ^2r(l-Co) 2 

F(u) + ln(l - u T u) > unuln 1 



2r(l-Co) 2 

F(u) > vnu In ( 2 ^""^ 2 ) - ln(l - u T u) 



F(u) > T := z/nw In ( 2 ^"^ 2 ) + In 



where the second line is Stopping Condition 2, and the last line follows from (|5.15j) . 

It remains to show that $F(\omega)$ reaches the value $T$ within $O(nu^2)$ iterations. Each $\kappa(a_i,b_i)$ 
value is the distance of the approximate analytic center, at some iteration, from the $i$th 
hyperplane. Consider the first distance to a hyperplane, set when the hyperplane is introduced 
in Case 2. By selection of $\beta$,

\[ a^Tz-\beta = \gamma_0^{-1}\sqrt{a^T(\nabla^2F(z))^{-1}a} = \gamma_0^{-1}\max_{y\in E(\nabla^2F(z),z,1)}a^T(y-z) \le \gamma_0^{-1} \tag{A.55} \]

since $E(\nabla^2F(z),z,1) \subset P \subset B_n$. In subsequent iterations, $\omega$ may drift farther from the 
hyperplane $(a,\beta)$. At worst, it can drift to the edge of the unit hyperball $B_n$. Therefore, we 
may safely say that

\[ \kappa(a_i,b_i) \le \gamma_0^{-1}+2, \quad \text{for all } i. \tag{A.56} \]

By Theorem 31, after $\iota$ iterations, $N(\omega) \ge \theta\iota$ for some $\theta > 0$. That is:

\[ -\sum_{i=1}^h\ln(a_i^T\omega-b_i)-\ln(1-\omega^T\omega) \ge \theta\iota-\sum_{i=1}^h\ln\kappa(a_i,b_i) \tag{A.57} \]

after $\iota$ iterations. From (A.57) and (A.56), we get

\[ F(\omega) \ge \theta\iota-h\ln(\gamma_0^{-1}+2). \tag{A.58} \]

Thus, $F$ is lower-bounded by a function that grows linearly with $\iota$. Therefore, $F(\omega) \ge T$ if 
the number of iterations $\iota$ satisfies 

i / 3vnu + 4 \ / 2 + vnu 
6l - vnu ln(7 + 2) > vnu In ( _ 1 + In I 

or, equivalently, 

vnu In (^gs) + In (^) + i/tw ln^ 1 + 2) 



t > 





Again, since we will see that $r \approx \delta/R$, $\iota \in O(nu\log(nuR/\delta))$ iterations suffices. □



Theorem 1351 (Theorem 20 in 9l|]). There exists some constant C d such that any time 
a hyperplane is discarded in Subcase 1.1, F d (z) — F d (oj d ) < K\. Likewise, there exists some 
constant C a such that any time a hyperplane is added in Case 2, F & (z) — F & (oj a ) < K 2 . 

Proof. We have already seen in (A.43) that

\[ F_d(\omega)-F_d(\omega_d) \le C_4. \tag{A.59} \]

By definition of the approximation, we know that $\omega \in E(\nabla^2F(z),z,\zeta)$. But 
$(\omega-z)^T\nabla^2F_d(z)(\omega-z) \le (\omega-z)^T\nabla^2F(z)(\omega-z)$, and thus $\omega \in E(\nabla^2F_d(z),z,\zeta)$. Let $(a_j,b_j)$ denote the discarded 
hyperplane. It follows from Lemma 12*0 that 

F d (z) - F d (oj) = VF d (oj) T (z -oj) + -(z- oj) T V 2 F d (oj)(z - oj) + Error 

= aj j Z - p- + h z - oj) T V 2 F d (oj)(z -u) + Error [see |~05j) ]. 

CLj OJ — Oj Z 

An argument essentially identical to the argument used to prove Lemma l2*oT b) shows that 



\(aj(z-u))/(aju- bj )\< 



(A.60) 
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Therefore, 



aj(z — u) 1 m n 

-V + - (z - oo) T V 2 F d (uj)(z -u) + Error 

a 1 - uj — bj 2 



< - v ^_ + -0 - u,) J V^F d (a,)(z - u) + Error 



< v - + -0 - V 2 F(o;)(z - w) + Error (A.61) 

1 -CV a A z ) 2 



< ^ v/ ^^ + -(^-cj) t V F ^ (z-lu) + Error by Lemma EH 

i-Cv^ 2 ) 2 ( 1_ 



: -v /g ^fL + l^g^ + Error 



,(*) 2 (!-0 2 

l c 2 

1-Cv^ + 2(1-C) 2 + 3(1-C)' 



So we have 



F d (z)-F d (uj)< Cv ^ + - / C ' + , ^ r . (A.63) 
v; v;_ l-CV^o 2 (1 — C) 2 3(1 -C) 



Together, (IA.59I) and (IA.63I1 imply 



F d (z) - F d (co d ) = F d (z) - F d (co) + F d (cu) - F d (u d ) 



< ^ i c 2 c 3 c 

- 2 (1 -C) 2 3(1-0 4 



. r ._ CoV^o 1 Co Co r 



This obviously provides a constant upper bound $C_d$ for Subcase 1.1. 
The proof for Case 2 is harder. Rewrite $F_a(z)-F_a(\omega_a)$ as

\[ F_a(z)-F_a(\omega_a) = F_a(z)-F_a(\omega)+F_a(\omega)-F_a(\omega_a). \]

We first work on a bound for $F_a(z)-F_a(\omega)$. We have

\[
\begin{aligned}
(z-\omega)^T\nabla^2F_a(z)(z-\omega) &= (z-\omega)^T\nabla^2F(z)(z-\omega)+\left(\frac{a^T(z-\omega)}{a^Tz-\beta}\right)^2 \\
&\le \zeta^2+\zeta^2\gamma^2 \qquad [\text{by Lemma 25(a)}] \\
&= (1+\gamma^2)\zeta^2.
\end{aligned}
\]
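It follows that $\omega \in E(\nabla^2F_a(z),z,\sqrt{1+\gamma^2}\,\zeta)$. Lemma 18 then implies that

\[ (z-\omega)^T\nabla^2F_a(\omega)(z-\omega) \le \frac{(1+\gamma^2)\zeta^2}{(1-\sqrt{1+\gamma^2}\,\zeta)^2} = \left(\frac{\sqrt{1+\gamma^2}\,\zeta}{1-\sqrt{1+\gamma^2}\,\zeta}\right)^2, \]

which says $z \in E(\nabla^2F_a(\omega),\omega,C_7)$, where $C_7 := \sqrt{1+\gamma_0^2}\,\zeta_0/(1-\sqrt{1+\gamma_0^2}\,\zeta_0)$. The 
second-degree Taylor approximation now gives 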

F a (z) - F a (oo) 

= VF & (uo) T (z -u) + -{z - u) T V 2 F a (uo)(z -u) + Error 

= VF a (u) T (z + \{z - u) t V 2 F{uj){z - u) + ~ " + Error 

-(^-c;) t V 2 F(c ( ;)(z-cj) + - — ^ + Error 

a- Lo> — p 
fa 1 

< 



a T aj- /? 2 V ; v yv y 2 V. a T u-(3 



i_-Cj 2 yi-cy 2 V1-C7; t 3(1 -c 7 ) 

Lemma EH 15.171 and Lemma Il8l Lemma 1^51 Lemma EJJ 

. Co7o 1 /Co V, 1 / CoTo \\ C? 

where we note that ()5.17|) and Lemma ITH1 imply z G E(V 2 F(uj),uj, j^) to get the second 
term in the second-last line above. We clearly have established that F a (z) — F a (cu) is bounded 
above by a constant. 

We saw in ()5.19|) that, in Case 2, oj & G £'(V 2 F a (u;), u, q^). Lemma [PHI now implies 

(w a - ^) T (V 2 F a (^ a ))(^ a -u)< (c^ a - ^ T f fa( ^ (^ -u)< 



which implies 

c^G £((V 2 F a (^ a )),^ a ,-^). (A.65) 

1 - 

A second-degree Taylor expansion and Lemma QUI give 

F a (cu) - F a (^ a ) = VF a (cu a f(cu - uo a ) + ^(cu - cu a ) T (V 2 F a (cu a ))(cu - cu a ) + Error 



- o + - 



2 VI -97/ 3(1 - - g^)) 
- 2 \l-q i0 ) 3(l-g f0 /(l-g f0 ))- 1 ' J 
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We put together the results of (jA.64j) and ()A.66|) to get 

F a (z) - F a {u a ) < C a , 

where 

r Co7o 1 ( Co \ 2 1 f Co7o \ 2 C 7 3 

*' l-Co7o 2 Vl -Co/ 2Vl-Co7o7 3(1 - C 7 ) 

We have established that a constant upper bound C a exists for F a (z) — F a (u a ). □ □ 
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